Web Data Connection Wizard

Web Data Connection Wizard

The SaveToDB add-in for Microsoft Excel allows getting data from web data services and saving data changes back using REST API, including OData.

Also, the SaveToDB add-in allows extracting data from HTML, XML, JSON, and CSV web pages. It supports merging loaded data into databases.

SaveToDB supports the following authorization methods: Windows, Basic, Forms, OAuth 1.0, and OAuth 2.0.

Below is the main dialog box:

SaveToDB Web Data Connection Wizard

Dialog Box Controls

URL
Use this field to input a URL.
In general, you can insert it from a browser.
SaveToDB allows configuring URLs. See below.
When you change the URL, the add-in tries to discover the other required parameters in the background.
If the add-in cannot discover the parameters or you need to use other values, specify them manually.
Service URL
Use this field to input the root address of a website or service.
The service URLs are defined unambiguously for OData web services only. The service URL is the service root URL that returns the service document.
The service URL for other web services and websites is a URL of any protected area.
A website can contain multiple protected areas. The root URLs of these areas can be used as service URLs.
In most cases, the add-in detects the root address automatically for all web resources.
Don't worry if the service URL is not defined correctly. In this case, you can be asked for authorization for multiple URLs from the website.
Authorization
Use this field to choose an authorization method or a configured provider.
The SaveToDB add-in detects such methods or providers successfully in most cases automatically.
Edit button (...)
Use this button to open the dialog box to configure OAuth1, OAuth2, or custom authorization settings, depending on the authorization method you selected.
Save in Cache
Check this field to store the authorization settings in the encrypted cache file.
Use this option to share your authorization settings across multiple Excel workbooks.
Otherwise, the add-in saves the settings in the active workbook only.
Accept
Use this optional field to set the Accept header.
Try setting the field manually if the webserver returns an error when querying with the default header.
Referrer
Use this optional field to set the Referrer header.
Try setting the value to the website page URL if the webserver blocks the request, like MSN Money.
UserAgent
Use this optional field to set the UserAgent header.
Parameters
Use this optional field to set parser parameters.
Configure REST
Use this button to open the REST Settings dialog box.
Open Cache...
Click this button to open the connection string cache.
You can edit, add and delete cached connection strings.
Note that the add-in encrypts connection strings using Microsoft Cryptography API.
Test Connection
Use this button to test the connection.
The add-in performs such checks in the background. However, it does not display error messages.
This button connects and displays a message if an error occurs.
Help
Use this button to open the user guide.
Cancel
Use this button to close the dialog box.
Back
Use this button to go to the previous step of the wizard.
Next
Use this button to go to the next step of the wizard.
The wizard contains additional steps if the web service supports selecting objects, like OData.
Finish
Use this button to finish the wizard and insert a query into Microsoft Excel.
The button is available when the add-in has successfully connected.
To understand the connection error, if the button is unable, click the Test Connection button.

Configuring URLs

The SaveToDB add-in automatically detects HTTP request parameters and places them on the ribbon.

For example, the following query has parameters like symbols, chartType, isEOD, isCS, and isVol.

https://finance-services.msn.com/Market.svc/ChartAndQuotes?symbols=AAPL&chartType=1y&isEOD=True&isCS=true&isVol=true

The add-in allows you to override parameters in the {<Parameter name>=<Default value>} format.

For example, for the above query, you can use a single Symbol parameter to input a ticker only:

https://finance-services.msn.com/Market.svc/ChartAndQuotes?symbols={Symbol=AAPL}&chartType=1y&isEOD=True&isCS=true&isVol=true

REST API-based web services return regular results that are easy to extract in automatic mode.

For other web pages and undocumented web services, the add-in uses smart algorithms to extract data.

You can configure the output using the special parser parameters described below.

For the example above, the data are in the Chart.Series JSON node.

Therefore, the URL may contain the rootpath=Chart.Series parameter separated by a semicolon:

https://finance-services.msn.com/Market.svc/ChartAndQuotes?symbols={Symbol=AAPL}&chartType=1y&isEOD=True&isCS=true&isVol=true;rootpath=Chart.Series

If you try this request, you will get historical stock prices from the MSN Money website.

However, you will get columns from the JSON response: T, Op, Hp, Lp, P, and V.

SaveToDB 10 allows using SELECT commands for HTTP requests and supports name aliases.

For example, a query might look like this (the request URL is in square brackets):

SELECT T AS Date, Op AS Open, Hp AS High, Lp AS Low, P AS Close, V AS Volume FROM [https://finance-services.msn.com/Market.svc/ChartAndQuotes?symbols={Symbol=AAPL}&chartType=1y&isEOD=True&isCS=true&isVol=true;rootpath=Chart.Series]

As a result, you will get a table with columns like Date, Open, High, Low, Close, and Volume and the Symbol parameter on the ribbon to change the ticker.

As mentioned above, you can save downloaded web data to databases, including using SaveToDB methods with macros.

You can also apply Excel features like inserting and adding new formula columns, sorting, conditional formatting, and so on.

Selecting OData Object

When connecting to OData Web services, the wizard contains an additional step to select an object.

SaveToDB Web Data Connection Wizard - OData Sample

Connecting Excel to OData - Selecting EntitySet or FunctionImport

OData containers are similar to database query list views.

You can use the query list to select a container or an object schema and then a container object to connect.

Specifying Parameters

If the query contains parameters, including FunctionImport parameters for OData, the wizard offers an additional step.

For example:

Connecting Excel to OData - Specifying OData query parameters

You can enter parameter values and check the parameters you want to place on the ribbon in the right column.

Parser Parameters

The SaveToDB add-in extracts data from HTML, XML, JSON, CSV, and plain text pages using built-in smart parsers.

You can tune parsers using the following parameters. Separate parameters by semicolons.

AddRowNum=true

Use this boolean option to add the row number column to the output.

ApiResult=true

Use this boolean option when a web service returns REST API JSON results.

This option disables special smart parsers used to extract table data from irregular JSON.

AsIs=true

Use this boolean option to suppress any special processing of the input data.

For example, you can suppress converting Yahoo timestamps to datetime values.

CapitalLetter=true

Use this boolean option to capitalize header first letters (JSON only).

CollapsedNodes=<node>[,...]

Use this option to include XML child node values into the parent node.

For example, an XML node has the following structure:

<passages>
<passage><hlword>Yandex</hlword> - the search engine that can...</passage>
</passages>

To have a single column (passages), use:

collapsedNodes=passages

FirstRow=<number>

Use this option to set the first row of the source data to output.

FirstRowHasNames=true|false

Use this boolean option to hint parsers that the source data has headers or not.

IgnoredTags=<tag>[,...]

Use this option to disable creating columns for the specified XML tags.

For example, an XML node has the following structure:

<passages>
<passage><hlword>Yandex</hlword> - the search engine that can...</passage>
</passages>

To ignore the hlword tag, use:

ignoredTags=hlword

NoSourceHeaders=True

Use this option to suppress using HTML table headers as column names.

For example, use this option to import financial statements with a periodic structure but dynamic column headers.

Use this option when the first row of CSV and text files or HTML table headers does not contain column names.

In this case, the add-in creates columns with names like F1, F2, and so on.

This option is extremely useful with the financial statements that use years or quarters as column headers.

This option is opposite to the firstRowHasNames option.

Pages=<number of pages>

Use this option to set the number of loaded pages (for example, the number of option chain pages).

For example:

pages=20

This option works if the add-in can detect next page URLs.

RootPath=<root path>

Use this option to set the root element to output.

Here are several examples for XML, JSON, and HTML sources:

rootPath=.InvoiceTransmission.Invoice
rootPath=calls,puts
rootpath=*.*.timestamp,*.*.*.quote,*.*.*.adjclose
rootPath=1

Without this option, the add-in detects output columns automatically.

SaveToDB supports multiple roots with JSON and HTML and a single root with XML.

RowValues=True

Use this option to output XML and JSON values in rows.

This option is useful for learning document structures.

SkippedNodes=<node>[,...]

Use this option to exclude XML or JSON nodes from the output.

You can specify column paths like parent1.parent2.column or suffixes like .column.

For example, to exclude columns with data types defined in the type attributes, use:

skippedNodes=.type