Extracting Data


  • 7/21/2019 Extracting Data


    Extracting Data

    Extracts are saved subsets of a data source that you can use to improve performance, upgrade your data to allow for more advanced capabilities, and analyze offline. You can create an extract by defining filters and limits that include the data you want in the extract. After you create an extract, you can refresh it with data from the original data source. You can either fully refresh the data, replacing all of the extract contents, or you can increment the extract, which only adds rows that are new since the last refresh.

    Extracts can:

    Improve performance. For file based data sources such as Excel or Access, a full extract takes advantage of the Tableau data engine. For large data sources, a filtered extract can limit the load on the server when you only need a subset of data.

    Add functionality to file based data sources, such as the ability to compute Count Distinct.

    Provide offline access to your data. If you are traveling and need to access your data offline, you can extract the relevant data to a local data source.

    Creating an Extract
    Using Extracts
    Refreshing Extracts
    Adding Rows from a File
    Upgrading Legacy Extracts
    Optimizing Extracts
    Updating Extracts on Tableau Server
    Tableau Data Extract Command-Line Utility
    Tableau Data Extract API

    Creating an Extract

    1. Select a data source on the Data menu and then select Extract Data to open the Extract Data dialog box.


    2. Optionally define filters to limit the data that will be extracted. Any fields that are hidden in the Data window will be automatically excluded from the extract. Click the Hide All Unused Fields button to quickly remove them from the extract.

    To add filters, click the Add button under the Filters list.

    3. Specify whether to Aggregate data for visible dimensions. When you select this option, the measures are aggregated using their default aggregation. Aggregating the data can minimize the size of the extract file and increase performance.

    When you choose to aggregate the data, you can also choose to Roll up dates to a specified date level such as Year, Month, etc.

    The examples below show how the data will be extracted for each aggregation option.


    Original Data

    Each record is shown as a separate row. There are 7 rows in the data source.

    Aggregate Data (no roll up)

    Records with the same date and region have been aggregated into a single row. There are 5 rows in the data source.

    Aggregate Data (roll up dates to Month)

    Dates have been rolled up to the Month level and records with the same region have been aggregated into a single row. There are 3 rows in the data source.

    4. Select the number of rows you want to extract. You can extract All, the Top N rows, or a Sample from the data source. Tableau first applies any filters and aggregation and then extracts the number of rows from the filtered and aggregated results.

    The number of rows options depend on the type of data source you are extracting from. For example, not all data sources support sampling, so that option is not always available.

    5. When finished, click Extract.

    6. In the subsequent dialog box, select a location to save the extract into and give the file a name. Then click Save.

    Depending on the size of your data source, extracting data can take a long time. However, after you have extracted the data and saved it to your hard drive, performance will improve.
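    The aggregation options in step 3 can be illustrated with a small sketch. The sample rows, the column layout, and the choice of SUM as the default aggregation are assumptions made for illustration only; they are not taken from the Tableau documentation, and the code does not reflect how the data engine is implemented. The rows were chosen so that the counts match the 7, 5, and 3 rows described in the examples.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample data: (order date, region, sales).
ROWS = [
    (date(2013, 1, 5), "East", 100),
    (date(2013, 1, 5), "East", 50),
    (date(2013, 1, 5), "West", 70),
    (date(2013, 1, 20), "East", 30),
    (date(2013, 2, 3), "East", 40),
    (date(2013, 2, 3), "East", 60),
    (date(2013, 2, 9), "East", 20),
]


def aggregate(rows, roll_up_to_month=False):
    """Sum the measure for each distinct (date, region) pair.

    With roll_up_to_month=True, each date is truncated to the first day
    of its month before grouping, which mimics rolling dates up to Month.
    """
    totals = defaultdict(int)
    for day, region, sales in rows:
        key_date = day.replace(day=1) if roll_up_to_month else day
        totals[(key_date, region)] += sales
    return sorted((d, r, s) for (d, r), s in totals.items())


no_roll_up = aggregate(ROWS)
by_month = aggregate(ROWS, roll_up_to_month=True)

# Row counts: original, aggregated, aggregated with dates rolled up to Month.
print(len(ROWS), len(no_roll_up), len(by_month))  # prints "7 5 3"
```

    Fewer rows means a smaller extract, but detail below the chosen date level is no longer available in the extract.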

    Using Extracts


    After you create an extract, the current workbook begins using the extract. However, the extract connection is not saved with the workbook until the next time you save. That means that if you close the workbook without saving first, the workbook will connect to the original data source the next time you open it.

    You may want to create an extract with a sample of the data so you can set up the view and then switch to the whole data source, thus avoiding long queries every time you place a field on the shelf. You can toggle between using the extract and using the entire data source by selecting a data source on the Data menu and then selecting Use Extract.

    You can remove an extract at any time by selecting a data source on the Data menu and then selecting Extract > Remove. When you remove an extract, you can choose to Remove the extract from the workbook only or Remove and delete the extract file, which will delete the extract from your hard drive.

    You can see when the extract was last updated and other details by selecting a data source on the Data menu and then selecting Extract > History.

    Refreshing Extracts

    When the underlying data changes, you can refresh the extract by selecting a data source on the Data menu and then selecting Extract > Refresh. Extracts can be configured to be fully refreshed, replacing all of the data with what's in the underlying data source, or incrementally refreshed, adding just the new rows since the last refresh.

    Full Extracts

    By default, extracts are fully refreshed. That means that every time you refresh the extract, all of the rows are replaced with the data in the underlying data source. While this kind of refresh ensures you have an exact copy of what is in the underlying data source, it can sometimes take a long time and be expensive on the database, depending on how big the extract is.

    If the extract is not set up for incremental refresh, refreshing the extract performs a full refresh. If you're publishing the data source to Tableau Server, you can specify the type of refresh in the Scheduling & Passwords dialog box.


    Incremental Extracts

    Rather than refreshing the entire extract, you can set it up to only add the rows that are new since the last time you extracted data. For example, you may have a data source that is updated daily with new sales transactions. Rather than rebuild the entire extract each day, you can just add the new transactions that occurred that day. Then once a week you may want to do a full refresh just to be sure you have the most up-to-date data.

    Follow the steps below to set up an extract to be incrementally refreshed.

    1. Select a data source on the Data menu and then select Extract.

    2. In the Extract Data dialog box, select All rows as the number of Rows to extract. Incremental refresh can only be defined when you are extracting all rows in the database. You cannot increment a sample extract.

    3. Select Incremental refresh and then specify a column in the database that will be used to identify new rows. For example, if you select a Date field, refreshing will add all rows whose date is after the last time you refreshed. Alternatively, you can use an ID column that increases as rows are added to the database.

    4. When finished, click Extract.

    The steps above can be used to define a new extract or configure an existing extract for incremental refresh. If you are editing an existing extract, the last refresh is shown so you can be sure you are updating the extract with the correct data.

    If you publish the data source to Tableau Server, you can specify a schedule for incremental refresh as well as full refresh in the Schedules & Passwords dialog box.
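    The difference between the two refresh types can be sketched as follows. This is a simplified illustration, not Tableau's implementation: the function names and sample rows are hypothetical, and the sketch treats the largest key value already in the extract as the cutoff for "new" rows, which is an assumption. It also shows why an occasional full refresh is useful: rows changed in place are not picked up by an incremental refresh.

```python
def full_refresh(source_rows):
    """Replace the extract contents with a copy of the source rows."""
    return list(source_rows)


def incremental_refresh(extract_rows, source_rows, key):
    """Append only the source rows whose key exceeds the largest key
    already present in the extract."""
    if not extract_rows:
        return list(source_rows)
    high_water_mark = max(row[key] for row in extract_rows)
    new_rows = [row for row in source_rows if row[key] > high_water_mark]
    return extract_rows + new_rows


# Hypothetical data source with an increasing ID column.
source = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}]
extract = full_refresh(source)

# Later, row 1 is corrected in place and a new row 3 is added.
source = [
    {"id": 1, "amount": 15},
    {"id": 2, "amount": 20},
    {"id": 3, "amount": 30},
]

extract = incremental_refresh(extract, source, key="id")
print([row["id"] for row in extract])       # [1, 2, 3]
print(extract[0]["amount"])                 # 10: the in-place change was missed
print(full_refresh(source)[0]["amount"])    # 15: a full refresh picks it up
```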

    Extract History

    You can see a history of when the extract was refreshed by selecting a data source on the Data menu and then selecting Extract > History.

    The Extract History dialog box shows the date and time for each refresh, whether it was full or incremental, and the number of rows that were added. If the refresh was from a file, it also shows the source file name.


    Adding Rows from a File

    You can add new data to an extract from a file. For example, you may take an extract from a data warehouse that has the past ten years' worth of data. However, new data has been kept in an Excel workbook. You can add the new data to the extract so that you can analyze the most recent information against the historical data.

    Follow the steps below to add data from a file.

    1. Select a data source on the Data menu and then select Extract > Add Data From File.

    2. Browse to and select the file that has the new data.

    3. Specify any Joins or Custom SQL necessary. The columns in the file must match the columns in the extract.

    4. When finished, click OK.

    The new rows are added to the extract. You can see a summary of the number of rows that were added by selecting a data source on the Data menu and then selecting Extract > History.

    When you refresh this extract, the data will be replaced with the data from the original data source.

    Upgrading Legacy Extracts

    If you have data extracts that were created before version 6.0, you should upgrade the extracts to use the data engine. When you open the workbook, you are given the option to upgrade the extracts.


    You can also upgrade the extracts by selecting a data source on the Data menu and then selecting Upgrade Extract.

    Optimizing Extracts

    To improve performance when working with extracts, you can optimize the extract. Optimizing an extract creates secondary structures in the extract that speed up future queries.

    Optimize the extract by selecting a data source on the Data menu and then selecting Extract > Optimize.

    The following types of optimizations are made:

    Materialized Calculated Fields

    Calculated fields are computed in advance and stored in the extract. In future queries, Tableau can look up the already computed value rather than running the computation again. The following types of calculated fields ARE NOT materialized:

    Calculations that use unstable functions such as NOW() and TODAY()

    Calculations that use external functions such as RAWSQL and R

    Table calculations

    In addition, if the formula for a materialized calculation changes or the calculation is deleted from the data source, the materialized calculation is dropped from the extract until the extract is optimized again.

    Acceleration Views

    When a workbook contains filters that are set to show only relevant values, computing the available values for that filter can be an expensive query. For these filters, Tableau must evaluate the other filters in the workbook first and then compute the relevant values based on their filter selections. To speed up these queries, a view can be created that computes the possible filter values and caches them for faster lookup later.

    Updating Extracts on Tableau Server


    You have the following options for updating extracts published to Tableau Server or Tableau Online:

    You can add the extract or a workbook that connects to it to a refresh schedule in Tableau Server or Tableau Online (cloud-based data sources only).

    You can update the extract in Tableau Desktop and then republish it.

    You can add to or refresh the extract in Tableau Server or Tableau Online without first adding to or refreshing the extract in Tableau Desktop.

    The remainder of this topic describes the third option.

    Refreshing Extracts Using Tableau Desktop

    Before you attempt to update an extract, verify the following:

    The data source was originally published as an extract.

    Tableau Desktop is connected to the published data source, as indicated by the Tableau Server icon next to the data source name in the Data window.

    To refresh an extract on Tableau Server or Tableau Online from Tableau Desktop, right-click the data source in the Tableau Desktop Data window, select Tableau Data Server, and choose one of the following options:

    Refresh from Source

    Refreshes the entire extract using the data in the original data source.

    This command is available only for extracts that include a connection to the original data source. If you connected directly to a Tableau Data Extract file (.tde) and then published it, the connection to the original data source is not included.

    Append from File

    Updates the extract from the contents of a file.

    If you do not see the Tableau Data Server option, your data source may not be on Tableau Server or Tableau Online (in which case the Tableau Server icon is not shown). If you see the Tableau Data Server option, but both commands are unavailable, the data source exists on the server, but it is not an extract.

    It is also possible to update an extract on Tableau Server using a command-line utility. See Tableau Data Extract Command-Line Utility.

    Tableau Data Extract Command-Line Utility


    The Tableau Data Extract Utility is installed with Tableau Desktop. You can use this utility at a Windows command prompt to refresh or add files to extracts published to Tableau Server or Tableau Online.

    To run the utility:

    1. Open the Command Prompt as an administrator and change to the Tableau Desktop bin directory. For example:

    cd C:\Program Files\Tableau\Tableau 8.1\bin

    2. Use either of the following commands, adding parameters from the command-options tables below.

    o tableau refreshextract

    o tableau addfiletoextract

    Note: When using the utility, always specify tableau on the command line or in scripts, never tableau.exe.

    Syntax and parameters for the tableau refreshextract command

    Use tableau refreshextract to refresh an extract on Tableau Server or Tableau Online.

    Refreshing an extract updates an existing extract with any modifications that have been made to the data source since the last refresh.

    To see help for this command, at the Windows command prompt, type the following command:

    tableau refreshextract --help

    All options have a full form that you use with a double hyphen (for example, --server). Some options also have a short form that you use with a single hyphen (for example, -s). If the value for an option contains spaces, enclose it in quotation marks.

    tableau refreshextract command options

    Short Form  Full Form                     Description
                --source-username             A valid username for the data source connection. Use this option with --source-password, or use --original-file instead of the username and password op...
                --source-password             The password for the data source user.
                --original-file ...filename>
                --force-full-refresh          If the data source is set up for incremental refreshes, use t... option is not included, an incremental refresh is performed ... refresh.
    -s          --server                      The URL for the Tableau server on which the data is publis... use http://online.tableausoftware.com.
    -t          --site                        In a multiple-site environment, specifies the site to which the ... must include this argument if your username is associated with more than one site. If you do not specify a site, the default site is assumed.
                                              The site id is independent of the site name, and it is indicated ... browser. For example, if the URL for the page you see after s... https://online.tableausoftware.com/t/vernazza/view the site id is vernazza.
                --datasource                  The name of the data source, as published to Tableau Ser...
                --project                     The project to which the data source belongs. If this option ... assumed.
    -u          --username                    Valid Tableau Server or Tableau Online user.
    -p          --password                    The password for the specified Tableau Server or Tableau ...
                --proxy-username              The username for a proxy server.
                --proxy-password              The password for a proxy server.
    -c          --config-file                 Path and filename information for a file containing configur... Config File below for details.

    Sample tableau refreshextract command

    The following command refreshes an extract named CurrentYrOverYrStats that has been published to an on-premise Tableau Server. This command specifies the following:


    The name of your Tableau Server.

    Server user name and password.

    Project name.

    The name of the data source to refresh, along with the data source username and password.

    C:\Program Files\Tableau\Tableau 8.1\bin>tableau refreshextract --server https://our_server_name --username OurServerSignIn --password OurServerPwd --project "New Animations" --datasource "CurrentYrOverYrStats" --source-username OurDatabaseSignIn --source-password OurDatabasePassword

    The following command refreshes an extract named CurrentYrOverYrStats that has been published to Tableau Online. This command specifies the following:

    Tableau Online user and password.

    Tableau Online site and project names.

    The data source, which in this case is hosted by a cloud-based data source provider (for example, Salesforce.com), and the username and password to sign in to the hosted data source.

    C:\Program Files\Tableau\Tableau 8.1\bin>tableau refreshextract --server https://online.tableausoftware.com --username [email protected] --password OurServerPwd --site vernazza --project "New Animations" --datasource "CurrentYrOverYrStats" --source-username database_user@hosted_datasource_provider.com --source-password db_password
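    If you call the utility from a script, building the command as a list of arguments avoids quoting mistakes for values that contain spaces. The following is a minimal sketch, assuming Python is available on the machine; the helper name build_refresh_command is hypothetical, and only options listed in the table above are used. The command is printed rather than executed.

```python
import subprocess


def build_refresh_command(server, username, password, project, datasource,
                          site=None, source_username=None,
                          source_password=None):
    """Return the argument list for `tableau refreshextract`.

    Hypothetical helper: it only assembles options that appear in the
    command options table; it does not validate them.
    """
    args = [
        "tableau", "refreshextract",   # "tableau", never "tableau.exe"
        "--server", server,
        "--username", username,
        "--password", password,
        "--project", project,
        "--datasource", datasource,
    ]
    if site:
        args += ["--site", site]
    if source_username:
        args += ["--source-username", source_username,
                 "--source-password", source_password]
    return args


command = build_refresh_command(
    server="https://our_server_name",
    username="OurServerSignIn",
    password="OurServerPwd",
    project="New Animations",
    datasource="CurrentYrOverYrStats",
    source_username="OurDatabaseSignIn",
    source_password="OurDatabasePassword",
)

# list2cmdline shows how the arguments would be quoted on Windows.
print(subprocess.list2cmdline(command))

# To actually run it, pass the list to subprocess from the Tableau bin
# directory, for example:
# subprocess.call(command, cwd=r"C:\Program Files\Tableau\Tableau 8.1\bin")
```

    Passing a list lets the subprocess module add the quotation marks around "New Animations" for you.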

    Syntax for tableau addfiletoextract

    Use tableau addfiletoextract to append file content to an extract that has been published to Tableau Server or Tableau Online. This command combines the two files.

    If you simply want to update an existing extract with the latest changes, use the refreshextract command instead. Using addfiletoextract to update an existing extract will duplicate data instead.

    To see help for this command, at the Windows command prompt, type the following command:

    tableau addfiletoextract --help


    All options have a full form that you use with a double hyphen (for example, --server). Some options also have a short form that you use with a single hyphen (for example, -s). If the value for an option contains spaces, enclose it in quotation marks.

    tableau addfiletoextract command options

    Short Form  Full Form          Description
                --file             Path and filename information for the data file containing data ... Access, a Tableau data extract, or a delimited text file. It can ... the file is on a network share. For example, \\server\path\...
    -s          --server           The URL for the Tableau server on which the data is publis... use http://online.tableausoftware.com.
    -t          --site             In a multiple-site environment, specifies the site to which the ... must include this argument if your username is associated with more than one site. If you do not specify a site, the default site is assumed.
                --datasource       The name of the data source, as published to Tableau Ser...
                --project          The project to which the data source belongs. If this option ... assumed.
    -u          --username         Valid Tableau Server or Tableau Online user.
    -p          --password         The password for the specified Tableau Server or Tableau ...
                --proxy-username   The username for a proxy server.
                --proxy-password   The password for a proxy server.
    -c          --config-file      Path and filename information for a file containing configur... Config File below for details.

    Sample tableau addfiletoextract command

    C:\Program Files\Tableau\Tableau 8.1\bin>tableau addfiletoextract --server https://our_server_name --username OurServerSignIn --password OurServerPwd --project "New Animations" --datasource "CurrentYrOverYrStats" --file "C:\Users\user1\Documents\DataUploadFiles\AprMay.csv"

    C:\Program Files\Tableau\Tableau 8.1\bin>tableau addfiletoextract --server https://online.tableausoftware.com --username [email protected] --password OurServerPwd --site vernazza --project "New Animations" --datasource "CurrentYrOverYrStats" --file "C:\Users\user2\Documents\DataUploadFiles\AprMay.csv"

    Using a Config File

    You can use a plain text editor, such as Notepad or Text Edit, to create a config (configuration) file that you can use with either tableau refreshextract or tableau addfiletoextract. A config file can be useful if you expect to update the same data source regularly over time. Instead of having to type the same options each time you run a command, you specify the config file. A config file also has the advantage of not exposing user names and passwords on the command line.

    Create the Config File

    For example, say you created a file called config.txt and saved it to your Documents folder, and in the file you included the parameter information shown below.

    For an extract published to an on-premise Tableau Server:

    server=https://our_server_name
    username=OurServerSignIn
    password=OurServerPwd
    project=New Animations
    datasource=CurrentYrOverYrStats

    For an extract from a hosted data source, published to Tableau Online:

    server=https://[email protected]
    password=OurPassword
    project=New Animations
    datasource=CurrentYrOverYrStats
    source-username=database_user@hosted_datasource_provider.com
    source-password=db_password

    Reference the Config File from the Command Line

    After you create the config file, you run the tableau refreshextract or tableau addfiletoextract command, pointing to the config file as the only option you use on the command line.

    For example, to refresh the extract specified in the sample in the Create the Config File section, you would run the following command (making sure that you are working in the bin directory for your version of Tableau Desktop):


    C:\Program Files\Tableau\Tableau 8.1\bin>tableau refreshextract --config-file C:\Users\user1\Documents\config.txt

    Syntax Differences for Config Files

    The syntax for specifying options inside a config file differs from the syntax you use on the command line in the following ways:

    Option names do not begin with dashes or hyphens.

    You use an equals sign (with no spaces) to separate option names from option values.

    Quotation marks are not necessary (or allowed) around values, even when they include spaces (as for the project option in the example shown earlier).
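    The three differences above can be applied mechanically when converting command-line options into config file text. The following is a minimal sketch; the helper name to_config_text is hypothetical, and it assumes that the config file holds one option per line.

```python
def to_config_text(options):
    """Render (name, value) pairs as config file lines.

    Applies the three syntax differences: no leading hyphens on option
    names, an equals sign with no surrounding spaces, and no quotation
    marks around values.
    """
    lines = []
    for name, value in options:
        name = name.lstrip("-")
        value = str(value).strip('"')
        lines.append("{0}={1}".format(name, value))
    return "\n".join(lines) + "\n"


# Options written the way they would appear on the command line.
command_line_options = [
    ("--server", "https://our_server_name"),
    ("--username", "OurServerSignIn"),
    ("--password", "OurServerPwd"),
    ("--project", '"New Animations"'),
    ("--datasource", "CurrentYrOverYrStats"),
]

config_text = to_config_text(command_line_options)
print(config_text)

# To save it, write the text to a file, for example:
# with open("config.txt", "w") as handle:
#     handle.write(config_text)
```

    Because the file contains passwords in plain text, store it where only the account that runs the command can read it.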

    Use Windows Task Scheduler to Refresh Extracts

    You can use Windows Task Scheduler, in combination with the Tableau Data Extract Command-Line Utility, to automate regular updates to Tableau Desktop data sources from within your corporate firewall. You can configure a task to occur once per day, week, or month, or after a specific system event. For example, you can run the task when the computer starts.

    To learn more, see the Task Scheduler How To... page in the Microsoft TechNet library.

    Tableau Data Extract API

    Use the Tableau Data Extract API to connect to data that is not currently a supported Tableau data source. With the Tableau Data Extract API, you create a program that accesses and processes your data. You then use that program to create a Tableau Data Extract (TDE) file.

    The Data Extract API is available for developers on Windows and Linux platforms. Go to the Get the Data Extract API page on the Tableau website, and choose the appropriate version for your platform and programming language:

    Data Extract API - Python 32-bit

    Data Extract API - Python 64-bit

    Data Extract API - C/C++/Java 32-bit

    Data Extract API - C/C++/Java 64-bit

    Notes for Developers

    The Data Extract API includes a sample program, makeorder, coded in each supported language to demonstrate a typical usage scenario: creating an extract containing product orders. The application creates the extract order.tde with several columns of different types. The general flow of the sample programs is:

    1. Open an Extract object to create a new file.

    2. Define the extract's schema using a TableDefinition.

    3. Add the Extract table.

    4. Insert rows.

    5. Close all objects.

    It is important to free memory by closing all objects, and it is especially critical to ensure that Extract objects are cleaned up properly, particularly in non-native execution environments. See the note sections below for language-specific details.

    String columns in a Data Extract can be 8- or 16-bit and can be sorted according to many available collations. By default, strings are sorted according to their binary representation, though this can be changed on a per-table or per-column basis.

    Python Notes

    Objects in the Data Extract API are automatically closed by __del__ when necessary. While garbage collection handles the vast majority of concerns related to releasing resources, it is important to note that the virtual machine provides no guarantee that any particular object will ever be freed. While most objects are merely memory, Extract objects represent physical files created when close is invoked. Therefore, it is not safe to rely on garbage collection to close Extract objects. We recommend using with statements to ensure Extract instances are cleaned up. Alternatively, you can explicitly call close.
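    The following sketch shows why a with statement is safer than relying on garbage collection. It does not use the real Data Extract module: StandInExtract is a hypothetical stand-in class with a close method, and contextlib.closing from the standard library is used to guarantee that close runs even when an error interrupts the work.

```python
from contextlib import closing


class StandInExtract(object):
    """Hypothetical stand-in for an Extract object (not the real API)."""

    def __init__(self, path):
        self.path = path
        self.rows = []
        self.closed = False

    def insert(self, row):
        self.rows.append(row)

    def close(self):
        # With a real Extract, this is where the physical file is created.
        self.closed = True


extract = StandInExtract("order.tde")
try:
    with closing(extract) as ex:
        ex.insert(("Beans", 2))
        raise RuntimeError("simulated failure while inserting rows")
except RuntimeError:
    pass

# close() ran when the with block exited, despite the exception.
print(extract.closed)  # True
```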

    Java Notes

    Data Extract API objects are automatically closed by finalize() as necessary. The Java Virtual Machine does not guarantee that any particular object is ever garbage collected. While most objects are merely memory that can be safely reclaimed by the operating system at JVM shutdown, Extract objects represent physical files that are created when close() is invoked. Therefore, it is important to invoke Extract.close() for all Extract instances. We recommend using the try-with-resources construct introduced in Java 7. For earlier versions of Java, you must call close() explicitly.

    C++ Notes

    Data Extract API objects should be managed according to standard memory management best practices, such as using stack variables or smart pointers. As in other languages, all objects have a Close() method to free internal resources. Close() is invoked by the destructor when necessary. However, it is important to note that Extract::Close() may throw an exception, so it is safer to call it explicitly, rather than allowing an exception to potentially escape the destructor.


    C Notes

    Objects in the Data Extract C API are managed through opaque TAB_HANDLEs. Every created object must be closed. It is advisable to free objects in the reverse order of creation.

    Using the Data Extract API in Microsoft Visual Studio 2010

    Follow these steps to build a C or C++ project in Visual Studio 2010:

    1. Extract the non-Python package to an installation directory, for example, C:\dataextract-8.0. This directory is the $(InstallRoot).

    2. Add $(InstallRoot)\bin to the %PATH% environment variable.

    3. Open Visual Studio and create a new console project.

    4. Open the Property Manager and create a new Property Sheet for use with all targets in the project. Add these paths:

    Path                    Description
    $(InstallRoot)\include  C/C++|General|Additional Include Direct
    $(InstallRoot)\lib      VC++ Directories|Library Directories
    dataextract.lib         Linker|Input|Additional Dependencies

    5. Add MakeOrder.c or MakeOrder.cpp from $(InstallRoot)\docs\samples.

    6. Compile and run the application.

    Using the Data Extract API in Eclipse

    Follow these steps to build a Java project in Eclipse:

    1. Extract the non-Python package to an installation directory, for example, C:\dataextract-8.0. This directory is the $(InstallRoot). Verify that the package is compatible with your Java platform; 64-bit libraries do not work with a 32-bit JVM and vice versa.

    2. Open Eclipse and create a new project. Add dataextract.jar and jna.jar to the build path as External JARs.

    3. Right-click on the project's src package and import $(InstallRoot)\docs\samples as a local file system resource. Verify that samples/com/tableausoftware/demos/MakeOrder.java is selected before dismissing the dialog.

    4. In the Run Configuration for MakeOrder, add an Environment Variable PATH=$(InstallRoot)\bin.


    Using the Data Extract API with Python

    Follow these steps:

    1. Verify that your installed Python is version 2.x, where x is 7 or higher. Version 3 is not supported. Also verify that it matches the package you have downloaded: the 64-bit module is incompatible with 32-bit Python and vice versa.

    2. Extract the Python package, open a command prompt, and navigate to the directory that contains setup.py.

    3. Run setup.py to install the module into site-packages.
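    The checks in step 1 can be done from the interpreter itself before you install the module. This is a small sketch using only the standard library; the function names are hypothetical, and the version rule simply restates the requirement given in step 1.

```python
import struct
import sys


def interpreter_summary():
    """Return (major, minor, bits) for the running Python interpreter."""
    bits = struct.calcsize("P") * 8   # pointer size: 32 or 64
    return sys.version_info[0], sys.version_info[1], bits


def meets_stated_requirements(major, minor):
    """True for Python 2.7 or a later 2.x release, as step 1 requires."""
    return major == 2 and minor >= 7


major, minor, bits = interpreter_summary()
print("Python %d.%d, %d-bit" % (major, minor, bits))
print("Meets the stated requirement: %s"
      % meets_stated_requirements(major, minor))
```

    Compare the reported bitness with the package you downloaded (32-bit or 64-bit) before running setup.py.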