Connecting to Web Data

Connecting to Web Data

SaveToDB Add-in for Microsoft Excel allows connecting to web data sources.

The following web data sources are supported:

  • Web services including OData services
  • Web pages

SaveToDB supports the following web data formats: HTML, XML, JSON, and CSV.

SaveToDB supports the following authorization methods: Windows, Basic, Forms, OAuth 1.0, and OAuth 2.0.

SaveToDB supports the following authorization providers: Google, Facebook, LinkedIn, Twitter, Microsoft Azure Marketplace, Windows Live, Yahoo, and Yahoo API Key.

SaveToDB Standard and Enterprise allows saving data changes to OData web services.

Selecting Provider

Use Gartle Web Data Provider to connect web data.

Connecting to Web Data Source

This step differs from the same step for database connections.

You have to specify the following data:

  • URL
  • Service URL
  • OAuth Provider
  • Scope
  • Accept

When you change the URL, the add-in tries to discover the required parameters in the background.
If the parameters cannot be discovered by the add-in, or other values should be used, you may specify them manually.

Any time you may test the connection with the current connection data.

The Next button is available for successfully connected OData web services as a web service object is selected in the next step.
For other web data sources, the Finish button is used as the specified URL data is inserted in Microsoft Excel.

URL

In general, you may insert the URL from a browser.

SaveToDB also allows redefining HTTP query parameters. See below.

You may also add the pages and rootPath parameters to the end of the URL. See below.

Service URL

The service URLs are defined unambiguously for OData web services only. The service URL is the service root URL that returns the service document.

The service URL for other web services and websites is a URL of any protected area.

A website can contain multiple protected areas. The root URLs of these areas can be used as service URLs.

Such areas cannot be found in all cases. So, sometimes you may need to specify the service URL manually.

Don't worry if the service URL is not defined correctly. In this case, you can be asked for authorization for multiple URLs from the website.

OAuth Provider

SaveToDB supports the following authorization methods:

  1. Windows
  2. Basic
  3. Forms
  4. OAuth 1.0
  5. OAuth 2.0

The first three methods are discovered automatically. The OAuth providers are discovered by website URLs.

You can manually set or clear the OAuth provider.

Scope

The scope is used for OAuth 2.0 providers only. SaveToDB discovers the known scopes for known OAuth providers.
However, the scope requirements are very different for various providers and websites. So, you may specify the required scope manually.

Accept

You may the Accept header by default or tune it if an error is occurred.

Redefining HTTP Query Parameters

The SaveToDB add-in detects parameters of HTTP queries automatically.

The parameters can be redefined in the form {<Parameter name>=<Default value>}.

For example, the URL https://finance.yahoo.com/q/hp?s=GOOG+Historical+Prices

has the "s" parameter with the "+Historical+Prices" suffix to the stock symbol.

You can configure the URL like https://finance.yahoo.com/q/hp?s={Symbol=GOOG}+Historical+Prices

In this case, the Symbol parameter is used to specify the stock symbol only.

The parameters are placed on the ribbon and allow changing the query further.

Pages and RootPath Parameters

;Pages=<number of pages>

The option specifies the number of loaded pages for the specified URL.

SaveToDB tries to find the next page URL and to load the next pages.

For example, you may load option data for multiple expiration dates from Yahoo! Finance:

https://finance.yahoo.com/q/op?s={Symbol=AAPL};pages=20

;RootPath=<root path>

The option specifies the root element of columns to output.

For XML and JSON, SaveToDB parsers look for the first element with the full column name ended with the specified value.

The full column name consists of all column names in the path separated by dots.

For JSON documents, you may specify multiple roots separated by comma or semicolon.

For HTML, you may specify a table number to output, not column name.
To find the right value, just try different values: ;rootPath=1, ;rootPath=2, and so on.

For example, you may load option data for calls and puts from Google Finance:

http://www.google.com/finance/option_chain?q={Symbol=AAPL}&authuser=0&output=json;RootPath=calls,puts;pages=2

Also, you may load Income Statements, Balances, CashFlows from Google Finance using the following URLs:

https://www.google.com/finance?q={Symbol=AAPL}&fstype=ii;RootPath=2
https://www.google.com/finance?q={Symbol=AAPL}&fstype=ii;RootPath=4
https://www.google.com/finance?q={Symbol=AAPL}&fstype=ii;RootPath=6
https://www.google.com/finance?q={Symbol=AAPL}&fstype=ii;RootPath=3
https://www.google.com/finance?q={Symbol=AAPL}&fstype=ii;RootPath=5
https://www.google.com/finance?q={Symbol=AAPL}&fstype=ii;RootPath=7

;CollapsedNodes=<node>[,...]

The option defines the XML nodes that include values of children nodes.

Example:

;CollapsedNodes=passages

;SkippedNodes=<node>[,...]

The option defines suffixes of XML nodes to exclude from the output.

To exclude specific columns, use complete paths with column names separated by dots like parent1.parent2.column.

Example:

;SkippedNodes=.type

;IgnoredTags=<tag>[,...]

The option defines tags to skip as separate columns. The values are included into parent columns.

The typical scenario is ignoring text highlight tags. The option is applicable only with XML.

Example:

;IgnoredTags=hlword

;NoSourceHeaders=True

The option suppresses using HTML table headers as column names.

Use this option to import financial statements from the web with static column names.

;AsIs=True

The option suppresses any special processing of the output data.

For example, the add-in converts Yahoo timestamps to datetime values. You may disable this using the AsIs option.

;RowValues=True

This option applies XML or JSON parsers that print all values in rows.

This option is useful for learning document structures.

Selecting OData Object

This step allows selecting an OData web service object to connect.

OData containers are similar to database query list views.

Using the query list, you may select a container and then an object of the container.

Specifying Parameters

This step is used to specify parameters of OData FunctionImport objects and HTTP queries.

Further, you may work with web data like with other database data.

This website is using cookies. By continuing to browse, you give us your consent to our use of cookies as explained in our Cookie Policy.