Append allows you to add new variables to respondents or cases within a project when information of common respondents is captured in separate data sources.
For example, one source contains the main survey information and another contains the weighting variables, and you need to link the two together. You can append data in two ways:
- Based on the number and order of records, or
- Based on unique identifiers.
When using append, you also need to take into account the unit of count: what each record in each data file represents, whether it be a respondent (person), an occasion, etc. Data may be collected at multiple levels; for example, one file contains information about respondents and a separate file contains information about their consumption occasions. If this is the case, you can append data across multiple levels by specifying the unit of count (Coming Soon).
Use append to add new variables into a project when respondents are common across data sources.
1. Append Data
You can append data sources when you create a project for the first time or append to a project that already contains the primary source. Regardless, the first step is to upload or connect to the data sources that are required in your project. Currently, there is no known limit on the number of secondary sources that you can append.
You need to identify the primary or parent source and then select the append option. Selecting the append option will start the append data wizard.
In the example below, there is an existing project which contains the primary source with the main survey information, and we want to append a secondary source which includes the weights.
- Select view/add sources.
- Select Add/Remove.
- This will open the sources browse area and display the data sources linked to your project.
- Choose Upload or Connect.
- Browse to the location of the secondary data source(s).
- Ensure your primary source is selected and then click on Append Sources and work through the wizard.
2. Append Data Wizard
1. Link - Select sources to append to the primary source
The first step is to identify the secondary sources you want to append to your primary source. The column on the left displays the available secondary sources.
Once you have selected them, the right column displays the linking hierarchy of sources. It is at this point that you need to define the unit of count.
If the unit of count of the secondary source is:
- The same as the primary source, usually respondents, you don't need to enter any information and can proceed to the next step.
- Different from the primary source (e.g. occasions) you need to enter the unit of count for each of the sources before you proceed to the next step. Learn more.
In the example below, each record in both the primary source and secondary source represents a respondent; therefore the same unit of count.
- Select the secondary source.
- Define the unit of count for each of the sources. If these are the same (e.g. respondents), you can leave this blank.
- Proceed to the Next step.
2. Define - data types for delimited sources
Harmoni automatically maps variables when data sources contain an inherent dictionary (metadata to guide interpretation). There are six variable types in Harmoni: Headings, Axes, Grids, Measures, Weights and Verbatims. Learn more about Harmoni Variable Types. Learn more about source dictionaries.
When this is the case, the append wizard displays a message indicating there are no delimited sources and that you can carry on to the next step.
Delimited sources (i.e. Excel XLSX, comma-delimited CSV or tab-delimited TXT) do not contain an inherent dictionary. Harmoni determines the best possible match to convert each source variable into a Harmoni type, but you have the option to override the automated mapping. In this case, the append wizard will open the data type option wizard. Learn more about mapping source variables.
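To illustrate why delimited sources need a mapping step, here is a minimal Python sketch of type guessing on a CSV file. It is not Harmoni's actual logic, which is not described in this article; the function names and the sample data are hypothetical, and the sketch only distinguishes two of the six types (Measure for all-numeric columns, Verbatim for everything else).

```python
import csv
import io

# Hypothetical sample of a delimited source with no inherent dictionary.
SAMPLE_CSV = """resp_id,age,comment
101,34,Liked it
102,51,
103,27,Too sweet
"""

def guess_variable_type(values):
    """Guess 'Measure' if every non-blank value is numeric, else 'Verbatim'."""
    non_blank = [v.strip() for v in values if v and v.strip()]
    if not non_blank:
        return "Verbatim"
    for value in non_blank:
        try:
            float(value)
        except ValueError:
            return "Verbatim"
    return "Measure"

def guess_column_types(csv_text):
    """Return a {column name: guessed type} mapping for a CSV string."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    return {name: guess_variable_type([row[name] for row in rows])
            for name in columns}

column_types = guess_column_types(SAMPLE_CSV)
print(column_types)
```

A guess like this can easily be wrong (for example, a numeric respondent ID is not really a measure), which is why being able to override the automated mapping matters.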
3. Match - Choose how to match records
When appending variables to existing respondents with the same unit of count (e.g. records in all sources represent respondents), you have a couple of options to match the records:
a) Based on the number and order of records
Harmoni will check if the number of records in all selected sources is the same. If they are equal, the sources you've selected will merge with the primary source.
- Respondent number and order must be the same across the primary and secondary source.
- If the secondary file contains a different number of records, a warning message will appear: "The selected data sources do not have the same number of records." If this is the case, it is best to append based on unique identifiers.
If you select this option, your sources will append and once the append process is complete you will be taken back to the browse area.
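The order-based option can be pictured with a short Python sketch. This is an illustration of the idea only, not Harmoni's implementation; `append_by_order` and the sample records are hypothetical, and each source is modelled as a list of dictionaries, one per record.

```python
def append_by_order(primary, secondary):
    """Merge two sources record by record, relying purely on position."""
    if len(primary) != len(secondary):
        # Mirrors the warning quoted above; raised as an error in this sketch.
        raise ValueError(
            "The selected data sources do not have the same number of records."
        )
    # Record i of the secondary source is attached to record i of the primary.
    return [{**p, **s} for p, s in zip(primary, secondary)]

# Hypothetical example: main survey answers plus a weights file in the same order.
survey = [{"q1": "Yes"}, {"q1": "No"}, {"q1": "Yes"}]
weights = [{"weight": 0.8}, {"weight": 1.3}, {"weight": 0.9}]

merged = append_by_order(survey, weights)
print(merged[1])  # {'q1': 'No', 'weight': 1.3}
```

Because nothing but position links the records, a secondary file sorted differently would merge without any warning and attach values to the wrong respondents. That is why the order must match as well as the count.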
b) Based on unique identifiers
Harmoni will use the unique identifier you specify in each source to merge records with the primary source. Orphaned records are ignored.
- A unique common identifier must exist across the sources.
- Harmoni will ask you to select the identifier in both the primary and secondary source and will match the records based on the identifier.
- The unique identifier:
  - Can be either a measure (numeric) or verbatim (text), but it must be the same type across the sources you want to append.
  - Must not have blanks; otherwise Harmoni will flag them as duplicates if more than one is found.
- If the identifier is unique and can be matched, the sources will append.
- If the identifier is not unique (i.e. repeated across multiple records), a warning will appear: "We have found duplicated records identifiers in your sources." If this is the case, you will need to insert unique identifiers in both your primary and secondary sources.
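The identifier rules above can be sketched in Python as follows. This is a simplified illustration, not Harmoni's implementation: the function names and sample data are hypothetical, identifiers are compared as-is (so a numeric 101 does not match the text "101"), and the sketch assumes that a primary record with no match simply keeps its existing variables.

```python
from collections import Counter

def find_duplicates(records, key):
    """Return identifier values that occur more than once (blanks included)."""
    counts = Counter(record[key] for record in records)
    return [value for value, n in counts.items() if n > 1]

def append_by_identifier(primary, secondary, primary_key, secondary_key):
    """Attach secondary variables to primary records that share an identifier."""
    if find_duplicates(primary, primary_key) or find_duplicates(secondary, secondary_key):
        raise ValueError("We have found duplicated records identifiers in your sources.")
    lookup = {record[secondary_key]: record for record in secondary}
    merged = []
    for record in primary:
        match = lookup.get(record[primary_key], {})
        # Drop the secondary identifier column so it is not appended twice.
        extra = {k: v for k, v in match.items() if k != secondary_key}
        merged.append({**record, **extra})
    # Secondary records with no primary counterpart (orphans) are never visited.
    return merged

# Hypothetical example: the weights file is in a different order and has an orphan.
survey = [
    {"resp_id": 101, "q1": "Yes"},
    {"resp_id": 102, "q1": "No"},
    {"resp_id": 103, "q1": "Yes"},
]
weights = [
    {"id": 103, "weight": 0.9},
    {"id": 101, "weight": 0.8},
    {"id": 999, "weight": 1.0},
]

merged = append_by_identifier(survey, weights, "resp_id", "id")
print(merged)
```

Here respondent 102 has no weight in the secondary source and the record with identifier 999 is dropped as an orphan. Two blank identifiers would be reported as duplicates, because a blank is counted like any other value.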
If you have multiple units of count (i.e. multi-level) you only have the option to match based on unique identifiers. Learn more.
If you choose to append using unique identifiers, you need to complete one more step in the wizard.
4. Append - Select one unique identifier for each source
If you choose to append based on a unique identifier, you will need to select the identifier in each of the sources. If the identifier is unique and can be matched, the sources will append and, once the append process is complete, you will be taken back to the browse area.
3. Append Data - Multi-Level (Coming Soon)
Data may be collected at multiple levels, for example, one of the files contains information about respondents and a separate file contains information about their consumption occasions. If this is the case, you can append data across multiple levels by specifying the unit of count.
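As a purely conceptual illustration of what multiple levels mean, the Python sketch below nests occasion-level records under the respondent they belong to. It says nothing about how Harmoni will implement this feature; the function name, field names and sample data are all hypothetical.

```python
from collections import defaultdict

def nest_occasions(respondents, occasions, key):
    """Group occasion-level records under their respondent-level record."""
    by_respondent = defaultdict(list)
    for occasion in occasions:
        # Each occasion carries the identifier of the respondent it belongs to.
        details = {k: v for k, v in occasion.items() if k != key}
        by_respondent[occasion[key]].append(details)
    return [
        {**respondent, "occasions": by_respondent.get(respondent[key], [])}
        for respondent in respondents
    ]

# Hypothetical example: one record per respondent, several occasions per respondent.
respondents = [{"resp_id": 1, "age": 34}, {"resp_id": 2, "age": 51}]
occasions = [
    {"resp_id": 1, "drink": "coffee"},
    {"resp_id": 1, "drink": "tea"},
    {"resp_id": 2, "drink": "water"},
]

nested = nest_occasions(respondents, occasions, "resp_id")
print(nested[0])
```

The two files have different units of count (two respondents, three occasions), so they cannot be matched by number and order of records. The shared identifier is what ties each occasion back to its respondent.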
Watch this space!