Sharepoint Online

Example Setup

The steps below are the minimum required to run the Cognite File Extractor against Sharepoint Online.

Note

The steps described here are just an example of how to set up the File Extractor to connect to Sharepoint Online sources. The configuration may differ depending on your use case (permissions, access restrictions at the Sharepoint level, etc.).

Create an App Registration

  1. Go to https://portal.azure.com/#home and log in using your Microsoft 365 account (the same one you use to log in to Sharepoint Online).
  2. Go to “App Registrations” and create a new app registration, which the Cognite File Extractor will use to connect to Sharepoint Online.
  3. After creating your App Registration, assign the API permissions that allow it to use the Microsoft Graph API. Below is an example of a minimal setup that allows the File Extractor to read all Sharepoint Online sites:
     Sites.Read.All (Type: Application)
     User.Read (Type: Delegated)

After finishing the steps mentioned above, the API permissions will look like this:

App Registration API permissions
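
One way to sanity-check the App Registration is to request an app-only token for Microsoft Graph using the OAuth 2.0 client-credentials flow. The sketch below only builds the request (URL and form body) for the Microsoft identity platform v2.0 token endpoint and does not send it. The function name and the placeholder tenant ID, client ID, and secret are hypothetical, and this page does not state whether the File Extractor uses exactly this flow internally.

```python
from urllib.parse import urlencode

# Microsoft identity platform v2.0 token endpoint; the tenant ID is filled in per tenant.
TOKEN_URL_TEMPLATE = "https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"


def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return (url, form_body) for a client-credentials token request.

    Hypothetical helper for illustration; it builds the request but does not send it.
    """
    url = TOKEN_URL_TEMPLATE.format(tenant_id=tenant_id)
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            # ".default" asks for the application permissions already granted
            # to the app registration (for example Sites.Read.All).
            "scope": "https://graph.microsoft.com/.default",
        }
    )
    return url, body


if __name__ == "__main__":
    # Placeholder values (assumptions); replace with your App Registration details.
    url, _body = build_token_request("my-tenant-id", "my-client-id", "my-secret")
    print(url)  # the body is not printed because it contains the secret
```

Sending the body as an `application/x-www-form-urlencoded` POST to the URL should return an access token if the client ID, secret, and tenant are correct.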

Sharepoint App permission

After creating the App Registration, you must also add it as an app permission in the Sharepoint administration site.

  1. Go to the following URL to create a new Sharepoint app permission: https://YOUR-SHAREPOINT-NAME-admin.sharepoint.com/_layouts/15/AppInv.aspx. Make sure to use your Microsoft 365 admin account.
  2. Set the “App Id” value to the App Registration Client ID and click “Lookup”. The lookup retrieves and fills in the details of the App Registration you created earlier.
Sharepoint App permission
  3. Set the “App Domain” and “Redirect URL” for your App Registration. For a local execution of the File Extractor, you can set the values to a localhost domain (see below):
     App Domain: www.localhost.com
     Redirect URL: https://www.localhost.com/default.aspx
  4. Add the App Permission Request XML to configure the Sharepoint permission level. This may vary between configuration scenarios. Below is a minimal XML with Read permissions to all Sharepoint sites.
<AppPermissionRequests AllowAppOnlyPolicy="true">
  <AppPermissionRequest Scope="http://sharepoint/content/sitecollection" Right="Read" />
</AppPermissionRequests>
Note

The configuration shown above may differ depending on your use case (permissions, access restrictions at the Sharepoint level, etc.).

  5. Click “Create”.
  6. Sharepoint will ask for a final confirmation after creating the app registration. Click “Trust It”.
Trust Sharepoint app
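
Curly quotes or a missing closing bracket, both easy to pick up when copying from a web page, make the App Permission Request XML invalid. A minimal sketch for checking the XML before pasting it into the form, using only the Python standard library, could look like this. The function name is hypothetical, and the check covers well-formedness and the root element name only, not what Sharepoint accepts as a valid scope or right.

```python
import xml.etree.ElementTree as ET

# The minimal permission request from this guide, with straight quotes
# and a properly closed root element.
PERMISSION_XML = """
<AppPermissionRequests AllowAppOnlyPolicy="true">
  <AppPermissionRequest Scope="http://sharepoint/content/sitecollection" Right="Read" />
</AppPermissionRequests>
"""


def summarize_permission_request(xml_text: str):
    """Parse the XML and return a list of (scope, right) pairs.

    Raises xml.etree.ElementTree.ParseError if the XML is not well-formed.
    """
    root = ET.fromstring(xml_text.strip())
    if root.tag != "AppPermissionRequests":
        raise ValueError(f"unexpected root element: {root.tag}")
    return [
        (request.get("Scope", ""), request.get("Right", ""))
        for request in root.findall("AppPermissionRequest")
    ]


if __name__ == "__main__":
    for scope, right in summarize_permission_request(PERMISSION_XML):
        print(f"{right} on {scope}")
```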

Run the extractor

After completing the previous steps, you are ready to extract files from Sharepoint Online.

  1. Download the Cognite File Extractor from the Cognite Data Fusion “Extract Data” page.
  2. Modify the “example-sharepoint.yaml” configuration template, setting the App Registration information and the related Sharepoint configuration parameters.
  3. Run the extractor. Below is an example log from a successful execution:
2023-11-21 09:06:16.564 UTC [INFO    ] ThreadPoolExecutor-0_0 - All files processed. 5 files uploaded
2023-11-21 09:06:16.564 UTC [INFO ] ThreadPoolExecutor-0_0 - Job "FileExtractor (trigger: interval[0:00:10], next run at: 2023-11-21 10:06:20 CET)" executed successfully
2023-11-21 09:06:20.207 UTC [INFO ] ThreadPoolExecutor-0_0 - Running job "FileExtractor (trigger: interval[0:00:10], next run at: 2023-11-21 10:06:30 CET)" (scheduled at 2023-11-21 10:06:20.184848+01:00)

  4. Check the extracted files in Data Explorer.
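
If you run the extractor unattended, you may want to confirm from the log that files were actually uploaded. The sketch below searches log text for the summary message shown in the example log above. It assumes the message format stays the same as in that sample, which may not hold for other extractor versions, and the function name is hypothetical.

```python
import re
from typing import Optional

# Pattern taken from the sample log line in this guide:
# "All files processed. 5 files uploaded"
UPLOAD_SUMMARY = re.compile(r"All files processed\. (\d+) files uploaded")


def uploaded_file_count(log_text: str) -> Optional[int]:
    """Return the file count from the last upload summary in the log, or None if absent."""
    matches = UPLOAD_SUMMARY.findall(log_text)
    return int(matches[-1]) if matches else None


if __name__ == "__main__":
    sample = (
        "2023-11-21 09:06:16.564 UTC [INFO    ] ThreadPoolExecutor-0_0 - "
        "All files processed. 5 files uploaded"
    )
    print(uploaded_file_count(sample))
```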

Finding the right paths in Sharepoint Online

The Cognite File Extractor can be configured to extract from a variety of paths in Sharepoint Online. It is possible to provide paths to sites, document libraries, folders and even directly to individual files.

This guide will help you find the correct paths to use for any of the above options.

Paths to Sites

Paths to sites can usually be picked right out of the URL bar in your browser. Just navigate to your Sharepoint Online site, select the URL in the URL bar, and copy-paste it into your configuration.

The screenshot below shows what the URL bar should look like when you are viewing the main page of your site.

Copying the URL from the screenshot above will give you the following site URL:

https://cogsp.sharepoint.com/sites/MySite
Info

URLs with /teams/ in the middle, instead of /sites/, are also valid.
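
A quick check that a copied URL has the shape of a site URL can catch mistakes such as pasting a Document Library or page URL instead. The sketch below is a heuristic based only on the pattern described in this section (`/sites/<name>` or `/teams/<name>` on a `sharepoint.com` host). Other valid site URL shapes may exist, and the function name is hypothetical.

```python
from urllib.parse import urlsplit


def looks_like_site_url(url: str) -> bool:
    """Heuristic: https://<tenant>.sharepoint.com/sites/<name> or /teams/<name>."""
    parts = urlsplit(url)
    # Split the path into its non-empty segments, e.g. ["sites", "MySite"].
    segments = [segment for segment in parts.path.split("/") if segment]
    return (
        parts.scheme == "https"
        and (parts.hostname or "").endswith(".sharepoint.com")
        and len(segments) == 2
        and segments[0] in ("sites", "teams")
    )


if __name__ == "__main__":
    print(looks_like_site_url("https://cogsp.sharepoint.com/sites/MySite"))
```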

Paths to Document Libraries

Sites in Sharepoint Online can contain one or more Document Libraries, which is where you store your files. New Sites usually have a single Document Library called Documents, but you are free to create any number of Document Libraries.

If you want to extract only a single Document Library from a Site, you need the URL for that Document Library. Start by navigating to the Document Library you want to extract. The URL in the URL bar should look like the URL for a site, followed by the Document Library name and a view path. See the screenshot below.

Copy the start of the URL up to, but not including, /Forms/AllItems.aspx. For the screenshot above, that leaves the following URL:

https://cogsp.sharepoint.com/sites/MySite/Shared%20Documents
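
The trimming step can also be done programmatically. The sketch below removes the `/Forms/AllItems.aspx` view suffix and any query string from a URL copied from the browser. The function name and the `viewid` query parameter in the usage example are assumptions for illustration. Only the default `AllItems.aspx` view named in this guide is handled.

```python
from urllib.parse import urlsplit, urlunsplit

# View suffix that the browser appends when showing a Document Library.
VIEW_SUFFIX = "/Forms/AllItems.aspx"


def document_library_url(browser_url: str) -> str:
    """Strip the view suffix, query string, and fragment from a Document Library URL."""
    parts = urlsplit(browser_url)
    path = parts.path
    if path.endswith(VIEW_SUFFIX):
        path = path[: -len(VIEW_SUFFIX)]
    # The path stays percent-encoded (for example %20), as in the copied URL.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    copied = (
        "https://cogsp.sharepoint.com/sites/MySite/Shared%20Documents"
        "/Forms/AllItems.aspx?viewid=example"
    )
    print(document_library_url(copied))
```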

Paths to Files and Folders

Paths to files and folders can be found in the same way in the Sharepoint Online UI. Find the file or folder you want to have the path for, and select Details from the drop-down menu.

This will open a sidebar with details for the file or folder. In the sidebar, scroll down until you see the section called Path.

Click the Copy direct link icon next to Path to put the correct URL on your clipboard. The resulting URL for the screenshot above would be:

https://cogsp.sharepoint.com/sites/MySite/Shared%20Documents/my%20folder/test.doc

As you can see, it is the same URL as for the Document Library, with the path to the file or folder appended.
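
Because a file or folder URL is the Document Library URL plus a percent-encoded path, the human-readable path inside the library can be recovered with the standard library. The sketch below uses a hypothetical helper name and the example URLs from this guide. It is meant for checking that a copied URL points into the library you expect, and is not part of the extractor.

```python
from urllib.parse import unquote


def path_within_library(item_url: str, library_url: str) -> str:
    """Return the decoded path of a file or folder relative to its Document Library."""
    prefix = library_url.rstrip("/") + "/"
    if not item_url.startswith(prefix):
        raise ValueError("item URL is not inside the given Document Library")
    # Decode percent-escapes such as %20 back into spaces.
    return unquote(item_url[len(prefix):])


if __name__ == "__main__":
    library = "https://cogsp.sharepoint.com/sites/MySite/Shared%20Documents"
    item = library + "/my%20folder/test.doc"
    print(path_within_library(item, library))
```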