Getting a list of Power BI reports and Data Sources from the Power BI API

In changing our reports to point from an older database server to a newer one, we needed a way to tell which reports we’d migrated so far and which ones still had data connections to the old server. Keeping a list of reports would be the easiest way to do this, but why do that when you can write a query to track it for you!

I came across this post that provides the outline of what I was looking for:

https://community.powerbi.com/t5/Report-Server/Get-all-connection-string-for-reports-from-power-bi-report/td-p/920392

It provided the following M-code, which uses the Power BI API to fetch a list of all reports on the server and all the datasource details (it doesn’t fetch the contents of the actual M-code behind the data sources – that would be pretty amazing). Nonetheless, pretty neat on its own (you need to replace “<YOUR URL>” in a couple of spots):

let

    Source = OData.Feed("https://<YOUR URL>/pbireports/api/v2.0/PowerBIReports", null, [Implementation="2.0"]),

    #"Expanded DataSources" = Table.ExpandTableColumn(Source, "DataSources", {"Id", "ModifiedBy", "ModifiedDate", "ConnectionString", "DataModelDataSource"}, {"DataSources.Id", "DataSources.ModifiedBy", "DataSources.ModifiedDate", "DataSources.ConnectionString", "DataSources.DataModelDataSource"}),

    #"Expanded DataSources.DataModelDataSource" = Table.ExpandRecordColumn(#"Expanded DataSources", "DataSources.DataModelDataSource", {"Type", "Kind", "AuthType", "Username", "ModelConnectionName"}, {"DataSources.DataModelDataSource.Type", "DataSources.DataModelDataSource.Kind", "DataSources.DataModelDataSource.AuthType", "DataSources.DataModelDataSource.Username", "DataSources.DataModelDataSource.ModelConnectionName"}),

    #"Added Conditional Column" = Table.AddColumn(#"Expanded DataSources.DataModelDataSource", "DS.Connection_String", each if [DataSources.ConnectionString] = null then "No Data Source" else [DataSources.ConnectionString]),

    #"Removed Columns" = Table.RemoveColumns(#"Added Conditional Column",{"CacheRefreshPlans", "AccessToken", "Roles", "ContentType", "Content", "ParentFolder", "Properties", "Comments", "AlertSubscriptions", "AllowedActions", "Policies", "DependentItems","Id", "ParentFolderId", "DataSources.Id", "DataSources.DataModelDataSource.ModelConnectionName"}),

    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"DataSources.ConnectionString", "DS.ConnectionString"}, {"DataSources.DataModelDataSource.Type", "DS.Type"}, {"DataSources.DataModelDataSource.Kind", "DS.Kind"}, {"DataSources.DataModelDataSource.AuthType", "DS.AuthType"}, {"DataSources.DataModelDataSource.Username", "DS.Username"}, {"DataSources.ModifiedBy", "DS.ModifiedBy"}, {"DataSources.ModifiedDate", "DS.ModifiedDate"}, {"Name", "Report"}, {"Path", "Report Path"}}),

    #"Add Report URL" = Table.AddColumn(#"Renamed Columns", "Report URL", each "https://<YOUR URL>/PBIReports/powerbi" & [Report Path] & "?rs:embed=true")

in

    #"Add Report URL"

This was a good framework, but I wanted to add a few things and combine multiple different servers together (My example shows a dev, qa, and prod server being combined, but they could be any number of servers or even both Power BI Server and SSRS if you wanted):

The M-code shown earlier became the highlighted item above – I converted it to a function (“PBI Data Sources” at the top of the “PBI Data Sources Function” group) that accepted both a “Server URL” (replacing the “<YOUR URL>” portion in the initial query) and a “Server Zone” allowing you to name the server in the results. Once you have the function, we move up to the last three objects in “Each Server”, which each call the function with different parameters (for each of my server zones), and then combine them into a single dataset (“Combined Data Sources”). Finally, the results are split into a list of Reports and a list of Data Sources that you can pull into your Power BI model.

Once of the issues I ran into was that I didn’t have access to all the reports on my server – those broke my dataset with the following message:

DataSource.Error: ODate Request failed: The remote server returned an error (500) Internal Server Error

Interestingly, it wasn’t a “Permission Denied” error, but a 500 server error. I got around it by adding some code to drop any rows where I received an error by using the “try otherwise” error handing in M, something I’d never used before:

= Table.SelectRows(#"Sorted Rows", each ((try [DataSources]{0}[Id] otherwise null) <> null))

This code checks to see if it can view the very first value within the “DataSources” field, and if it can’t, it returns null and then the row gets filtered. This isn’t the most elegant way to do this – it drops any reports I don’t have access too, rather than calling them out, but it got the job done and allowed my dataset to move forward without errors.

Once the data is in the model, you can use some simple visualizations to see what types of connections you have, where you’re connecting, and what user those connections are set up to use (so you can see if every one of your connections is using the correct proxy users).

I hope this helps you get a handle on all the data sources you’re using, and make sure that everything is using the correct settings. I’ve included the empty PBIT file at the end of this post for anybody that wants to connect to their data and see what it looks like (when you open the file, it prompts you for two variables – you can put whatever you want there, but it’s because I’ve left the source query in the file).

If you do end up using this to create something, please let me know what as I’d love to see it in action!

Download the PBIT file

Query Business Objects repository for list of reports

We’re migrating away from Business Objects Enterprise (XI-R2, not the newer 3.X version) and into SQL Reporting Services (it’s about time!) and I needed an inventory of everything that was still in BOE. The repository is obfuscated so that it’s not human readable, but thanks for some sleuthing and an 8-year-old post from a Business Objects forum, I wanted to share the script I was able to come up with.

The hard part (decoding the obfuscated text from the database into human-readable format) came from a post by “marklee” on a Business Objects board (his post is about 1/4 of the way down):

http://www.forumtopics.com/busobj/viewtopic.php?t=124272

His query was also written for Oracle, so I converted it to one that would run against SQL Server, and then I added some additional content types based on what we had in our database, and then attached a few additional rollups (like showing the root folder, building out the full path to the report, and returning the number of instances of each report that had been generated.

You can download the full script if you’d like, or you can view the contents below. Enjoy!

select 
 objectid, 
 parentid, 
 ownerid, 
 convert(datetime, REPLACE(left(lastmodifytime, 10),' ','-') + ' ' +
			replace(substring(lastmodifytime, 12, 8), ' ', ':') +
			'.' + substring(lastmodifytime, 21, 3)) as lastmodifytime, 
 upper(left(objectname,len(objectname)-2)) ObjectName, 
 typeid, 
 type, 
 convert(bit, 0) as IsInstance,
 si_cuid 
 into #results
 from 
 ( 
 SELECT 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 replace( 
 cast(objname as varchar(2000)) 
 ,'S','v') 
 ,'M','s') 
 ,'A','m') 
 ,')','a') 
 ,'+','b') 
 ,'C','n') 
 ,'-','c') 
 ,'/','d') 
 ,'O','t') 
 ,'E','o') 
 ,'1','e') 
 ,'3','f') 
 ,'G','p') 
 ,'5','g') 
 ,'7','h') 
 ,'W','x') 
 ,'U','w') 
 ,'Q','u') 
 ,'I','q') 
 ,'9','i') 
 ,':','i') 
 ,';','j') 
 ,'K','r') 
 ,'=','k') 
 ,'?','l') 
 ,'[','y') 
 ,']','z') 
 ,'!@',' ') 
 ,'B~S','&amp;') 
 ,'!BO','.') 
 ,'B|C&quot;','(') 
 ,'!B|D',')') 
 ,'M|Z','-') 
 ,'M}L',',') 
 ,'M|N','_') 
 ,'M}Z',':') 
 ,'!B{B','''') 
 ,'`|&lt;','0') 
 ,'`|&gt;','1') 
 ,'`|@','2') 
 ,'`|B','3') 
 ,'`|D','4') 
 ,'`|F','5') 
 ,'`|H','6') 
 ,'`|J','7') 
 ,'`|L','8') 
 ,'`|N','9') 
 ,'{','') 
 ,'!','') 
 ,'&quot;','') 
 ,'@','') 
 ObjectName, 
 convert(varchar(100), case 
	 when TypeID = 1 then 'Folder' 

	 when TypeID = 2 then 'Crystal Report'
	 when TypeID = 260 then 'Web Intelligence Document'
	 when TypeID = 290 then 'Object Package' 
	 when TypeID = 265 then 'Report Instance (TXT)'
	 when TypeID = 262 then 'Webi Report' 
	 when TypeID = 314 then 'Deski Report' 

	 when TypeID = 283 then 'PDF' 
	 when TypeID = 267 then 'Text' 
	 when TypeID = 323 then 'Excel' 
	 when TypeID = 319 then 'Excel (XLS)'
	 when TypeID = 259 then 'Word (DOC)'
	 when TypeID = 279 then 'Report Instance (PDF)'

	 when TypeID = 266 then 'Universe' 
	 when TypeID = 278 then 'Publication' 
	 when TypeID = 299 then 'Connection' 
	 when TypeID = 8  then 'Shortcut' 
	 when TypeID = 20 then 'Groups' 
	 when TypeID = 13 then 'Server' 
	 when TypeID = 16 then 'BO Server' 
	 when TypeID = 21 then 'Event' 
	 when TypeID = 24 then 'License Key' 

	 when TypeID = 19 then 'User type 19' 
	 when TypeID = 18 then 'User type 18' 
	 when TypeID = 47 then 'User type 47' 
	 when TypeID = 48 then 'User type 48' 
 
	 else 'Other' 
 end) Type, 
 * 
 FROM [dbo].[CMS_InfoObjects5] 
 where typeid &lt;&gt; 284 -- appears to be some kind of temporary file
 ) BORepository 

create clustered index #idx_c_results on #results (objectid)
create index #idx_results_parentid on #results (parentid)
create index #idx_results_ownerid on #results (ownerid)

-- Mark the ones that are instances of another object (children of non-folder)
update r
   set r.IsInstance = 1
 from #results r
join #results r2
  on r.ParentID = r2.ObjectID
where r2.TypeID &lt;&gt; 1
  and r.TypeID &lt;&gt; 1




-- Define all the levels of the objects and add them to a lookup table
DECLARE @ObjectLevel INT = 0

select 23 as objectid, 0 as parentid, 0 as ownerid, GETDATE() as lastmodifytime,
		convert(varchar(100), '$') as objectname, convert(int, null) as ChildInstances, convert(datetime, null) as LastInstanceTime,
		1 as typeid, convert(varchar(100), 'Root Folder') as type, @ObjectLevel as objectlevel 
into #objects

while @@rowcount &gt; 0
begin
	;with p as (select parentid, count(*) as instances, max(lastmodifytime) as LastInstanceTime from #results group by parentid)
	update #objects
	   set ChildInstances = p.instances,
		   LastInstanceTime = p.LastInstanceTime
	  from #objects o
	  join p
	    on o.objectid = p.ParentID
	 where o.ChildInstances is null
	   and o.typeid &lt;&gt; 1
	
	SET @ObjectLevel = @ObjectLevel + 1

	insert into #objects
	select r.objectid, r.parentid, r.ownerid, r.lastmodifytime, r.objectname, null, null, r.typeid, r.type, @ObjectLevel as objectlevel
	from #results r
	join #objects o
      on r.parentid = o.objectid
	 and o.typeid = 1
	left
	join #objects o2
	  on r.objectid = o2.objectid
   where o2.objectid is null

end


-- Set up the object paths
select objectid, parentid, lastmodifytime, convert(varchar(4000), '') as ObjectPath,
		objectname, childinstances, lastinstancetime, typeid, type
  into #objectswithpath
  from #objects
 where typeid &lt;&gt; 1

while @@rowcount &gt; 0
BEGIN
	update #objectswithpath
	   set parentid = o.parentid,
			objectpath = o.objectname + '\' + objectpath
	  from #objectswithpath owp
	  join #objects o
	    on owp.parentid = o.objectid
END

select objectpath, objectname, lastmodifytime, ChildInstances, LastInstanceTime, type
 from #objectswithpath
order by ObjectPath, objectname


--select * from #results
--select typeid, type, count(*) from #objects group by typeid, type order by count(*) DESC

/*
select type, case when isnull(ChildInstances,0) = 0 then 0 else 1 end, count(*), max(LastInstanceTime) --objectpath, objectname, lastmodifytime, ChildInstances, LastInstanceTime, type
 from #objectswithpath
 where ObjectPath not like '$\REPORTS OFF LINE\%'
group by type, case when isnull(ChildInstances,0) = 0 then 0 else 1 end
order by type, case when isnull(ChildInstances,0) = 0 then 0 else 1 end
*/

--select * from #objectswithpath where type = 'Word (DOC)'


drop table #objectswithpath
drop table #objects
drop table #results

SQL Agent job fails with QUOTED_INDENTIFIER error

If you have a SQL Agent job that’s failing, but the query succeeds when you run it manually in SQL Management Studio, it’s possibly a difference between the SQL Server default connection settings and those SSMS uses as the default. In my case, I was inserting from a stored procedure and received the following error:

[SQLSTATE 42000] (Error 1934) INSERT failed because the following SET options have incorrect settings: ‘QUOTED_IDENTIFIER’. Verify that SET options are correct for use with indexed views and/or indexes on computed columns and/or filtered indexes and/or query notifications and/or XML data type methods and/or spatial index operations.

To view the default connection options for your SQL Server instance, right-click on your instance in SSMS and select “Properties”, and then browse to “Connections” – you’ll see something like the following list (on my server, none of these are enabled by default for new connections):

Server Defaults

To compare it to the default settings for your installation of SSMS, click on “Tools” and then “Options…”, and then browse to “Query Execution” -> “SQL Server” and view the settings under both “Advanced” and ANSI” (In my case, SSMS was enabling a number of settings by default for my connections that SQL Agent jobs wouldn’t have enabled):

SSMS Results ANSI

SSMS Results Advanced

In my case, I just added an explicit “SET QUOTED_IDENTIFIER ON” to the beginning of the script in my SQL Agent job and it resolved the error message. Hope this helps you too!

Querying Active Directory from SQL Server

SQL Server provides some pretty flexible integration with Active Directory through the ADSI Linked Server provider, something that’s present by default when you install SQL Server. If you’ve never used it before, it allows you to connect to a domain controller and query AD the same way you’d query any other linked server. For example, it gives you the option to:

  • Identify when logins to SQL Servers or databases that support financial applications exist, but have no matching AD account (either direct integrated logins, or if SQL logins or rows in a “User” table have been set up to match the AD login)
  • Kick off alerts to provision the user in various systems based on their AD group membership
  • Automatically trigger an action when a new account appears in active directory (for example, we auto-provision security badges and send an email alert to our head of security to assign the appropriate rights)

While much of this could also be done from Powershell as well, we use the SQL Server Agent to manage many of our scheduled job (because it’s so handy to have the agent remotely accessible), as well as sometimes just needing data from AD in a query. To support a number of processes we have in place, we run a synchronization job every so often throughout the day that pulls about two dozen fields for all users and synchronizes them into a table if anything has changed.

Setting up the linked server itself is pretty straightforward (courtesy of http://community.spiceworks.com/how_to/show/27494-create-a-sql-linked-server-to-adsi):

  1. Create the linked server itself
  2. Set the security context (if you want to query AD as something other than the SQL Server Service account – by default, all domain users can do this and it’s only required if the domain is remote or if, for some reason, your SQL Service account’s AD rights have been restricted, like if you’re running as “LOCAL SERVICE”)
  3. Enable OPENQUERY (Ad Hoc Distributed Queries)

You’ll notice that setting up the linked server itself doesn’t actually specify where Active Directory is located or what domain/forest you’ll be querying – that’s actually done in the query itself. In each query, you’ll need to specify the FQDN (Fully-qualified domain name) of the domain (or OU) of the domain you’re querying. For example, we’d get all users from a domain by issuing the following query (in this example, “ADLinkedServerName” is the linked server we just created, and our domain is “corp.mycompany.local”):

SELECT EmployeeNumber, Name AS FullName, givenName as FirstName, sn as LastName,
L AS Location, samAccountName as ADAccount
FROM OPENQUERY(ADLinkedServerName,'SELECT Name, L, givenName, sn,
EmployeeNumber, EmployeeID,samAccountName,createtimestamp
FROM ''LDAP://OU=Users,DC=corp,DC=mycompany,DC=local''
WHERE objectClass =''user''') ad

This query will search that OU (“Users”, in this case) and everything below it, so changing the FROM to “LDAP://DC=corp,DC=mycompany,DC=local” would fetch the entire directory (for all the “user” objects), regardless of what folder they appeared it – if your directory puts users in another OU (like “Associates”, for example), you should adjust the query accordingly.

For column names, you can pull any AD properties at all that you’re looking for – even custom ones that aren’t part of a standard AD configuration. To get an easy list of AD properties to choose from, I like using ADSIEDIT (part of Microsoft’s Remote Server Administration Tools – download RSAT for Windows 7 or RSAT for Windows 8.1) – just drill down all the way down to an object, like a user, right click on them and select “Properties”, and you can see a list of all the properties on that account. If you’ve got Domain Admin rights, this tool can be used to modify these values too, but for querying, you only need to be a domain user or somebody who has rights to browse AD. Make a note of the names of particular properties that you’re interested in – also note that AD queries are case-sensitive, so you’ll need to note the casing of these properties as well.

One potential gotcha that I’ve run into is that maximum result size that AD will return in a single query can be set as part of domain policy – by default it’s 1000 records at once, and can be configured by setting or adjusting the “PageSize” property on your domain controllers (see https://support.microsoft.com/kb/315071/en-us). Also, there’s a “MaxResultSetSize” property as well that’s set to 256KB by default, but I’ve never hit it – unless you’re pulling every single property back, you’d likely hit the PageSize row limit before you hit the ResultSize byte limit, but remember that both are there. If you do hit the AD result count limit, it will return the rows up to the limit, but then execution stops with a kind of cryptic error:

Msg 7330, Level 16, State 2, Line 2
Cannot fetch a row from OLE DB provider &quot;ADsDSOObject&quot; for linked server &quot;YOURDOMAIN&quot;.

If your domain is larger than the PageSize limit, you’ll need to cut your query into multiple return sets of data so you don’t exceed the limit on any single query. Since our domain contains about 2400 users, we were able to do it in two queries, broken up like this:

SELECT samAccountName
  FROM OPENQUERY(ADLinkedServerName,'SELECT samAccountName
                                       FROM ''LDAP://OU=Users,DC=corp,DC=mycompany,DC=local''
                                      WHERE objectClass =''user''
                                        AND givenName&lt;''L''') as c
UNION ALL
SELECT samAccountName
  FROM OPENQUERY(ADLinkedServerName,'SELECT samAccountName
                                       FROM ''LDAP://OU=Users,DC=corp,DC=mycompany,DC=local''
                                      WHERE objectClass =''user''
                                        AND givenName&gt;=''L''') as c

By dividing the names on L, this cut the directory roughly in half – if yours was larger, you could divide it by querying each OU separately, or by looping through letters of the alphabet, or whatever makes sense in your setting. You could even do something dynamic like pull as many records as you can, then grab the value from the last record you pulled and use it as the baseline to pull the next set as far as you can, and then repeat until you run out of records. Linked servers don’t allow you to dynamically assemble your query at run-time – it has to be hard-coded in the query – but there are some ways around that (like building your OPENQUERY as a string and then executing it via sp_executesql, for example).

Now that you have your AD records stored in a temp table, you can identify new/changed records and merge them into a SQL table you already have ready using an INSERT/UPDATE/DELETE or MERGE statement, or possibly trigger notifications or some other business process.

I hope this is helpful – if you’d like some more detail, please leave a comment and I’m happy to elaborate where it’s necessary!

Removing expired/unused SSRS subscriptions

SQL Reporting Services doesn’t do a very good job keeping the SQL Agent clean by removing expired or otherwise unusable subscriptions from the job list. To deal with this, we created a script that pulls some details about these old subscriptions, including the report responsible, the last run time and status, and the user who originally scheduled it. If you notice your SQL Agent job list getting excessively long, you can use this query to identify the culprit reports and owners, and then either notify them or remove the old subscriptions manually yourself (run this on the server with your SSRS databases):

  select c.Name as ReportName,
         s.EventType,
         s.Description as SubscriptionDescription,
         s.LastStatus as LastSubscriptionStatus,
         s.LastRunTime SubscriptionLastRunTime,
         case
            when recurrencetype = 1 then 'One Time'
            when recurrencetype = 2 then 'Hourly'
            when recurrencetype = 4 then 'Daily'
            when recurrencetype = 5 then 'Monthly'
            when recurrencetype = 6 then 'Month Week'
            else 'Other'
         end as RecurranceType,
         s.DeliveryExtension,
         u.UserName as SubscriptionSetUpBy,
         s.ModifiedDate as SubscriptionLastModifiedDate
    from [ReportServer].[dbo].[Subscriptions] s
    join [ReportServer].[dbo].[Catalog] c
      on c.ItemID = s.Report_OID
    join [ReportServer].[dbo].[Users] u
      on u.UserID = s.OwnerID
    join [ReportServer].[dbo].[reportschedule] rs
      on c.itemid = rs.reportid
     and s.subscriptionid = rs.subscriptionid
    join [ReportServer].[dbo].[schedule] sch
      on rs.scheduleid = sch.scheduleid
   where s.EventType &amp;amp;lt;&amp;amp;gt; 'RefreshCache'
     and s.LastRunTime &amp;amp;lt; dateadd(m, -3, getdate())
order by c.name

There are a number of similar scripts out there that pull much of this information together, but there wasn’t one that collected all the details we were looking for in one place. From here, you can deal with the subscriptions as you see fit.

Note that you can just remove the old subscriptions by brute force if you’d prefer, and SSRS will clean up the orphaned SQL jobs, but I’ve preferred to review the list and notify users as we’ve never had too much volume to deal with. If you want to just delete them straight away, you can do so here:

DELETE ReportServer.dbo.Subscriptions
WHERE InactiveFlags != 0
	OR LastRunTime &amp;lt; dateadd(m, -3, getdate())