4.5.2.2 Data Reduction



Summary

Origin offers several tools for reducing data in worksheets: Data Filters, Worksheet Query, and several X-Functions that reduce the number of data rows by different methods.

Minimum Origin Version Required: 2015 SR0

What You Will Learn

In this tutorial, you will learn how to:

  • Reduce XY data to evenly spaced X
  • Reduce duplicate X data in an XY dataset
  • Remove or combine duplicate worksheet rows
  • Reduce XY data by group
  • Reduce worksheet rows

Steps

Reduce to Evenly Spaced X

  1. Create a new workbook and click the Import Single ASCII button to import the file Signal with High Frequency Noise.dat from the <Origin EXE folder>\Samples\Signal Processing folder.
  2. Highlight column B and select Analysis:Data Manipulation:Reduce to Evenly Spaced X to open the reduce_ex dialog. Edit the settings so that they match the following.
    Tutorial Data Reduction 06.png
  3. Click OK. A new column (column C) is added to the worksheet. This column carries its own sampling interval information. Click the column header to select the column, then choose Column: Show X Column. In the Show X Column: colshowx dialog box, click OK to generate an X column from the sampling interval. You can see that the original XY dataset has been reduced by generating a new, larger sampling interval.
    Tutorial Data Reduction 07.png
  4. Highlight columns B and D (hold down the Ctrl key to select multiple columns), and click the Line button to generate a line plot of the original (black) and reduced (red) data.
  5. You can see from the plot that the data size is considerably reduced:
    Tutorial Data Reduction 08.png
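For readers who want to see the idea outside Origin, the following is a minimal Python sketch of reducing XY data to evenly spaced X. It is not the reduce_ex implementation: the function name reduce_to_even_x, its parameters, and the choice of linear interpolation are assumptions made for illustration, and the dialog settings used in the tutorial may differ.

```python
from bisect import bisect_right


def reduce_to_even_x(x, y, n_points):
    """Resample (x, y) onto n_points evenly spaced X values.

    Hypothetical helper: uses linear interpolation and assumes x is sorted
    in strictly increasing order. Returns the new X values, the new Y values,
    and the new (larger) sampling interval.
    """
    if n_points < 2 or len(x) < 2 or len(x) != len(y):
        raise ValueError("need matching x/y with at least 2 points, n_points >= 2")

    x_first, x_last = x[0], x[-1]
    step = (x_last - x_first) / (n_points - 1)

    new_x, new_y = [], []
    for i in range(n_points):
        # Pin the last point to x_last to avoid floating-point drift.
        xi = x_first + i * step if i < n_points - 1 else x_last
        # Find the segment [x[j], x[j + 1]] that contains xi.
        j = min(max(bisect_right(x, xi) - 1, 0), len(x) - 2)
        t = (xi - x[j]) / (x[j + 1] - x[j])
        new_x.append(xi)
        new_y.append(y[j] + t * (y[j + 1] - y[j]))
    return new_x, new_y, step
```

Because the output X values are evenly spaced, they can be described by a start value and the returned step alone, which is the same reason the reduced column in the worksheet needs only a sampling interval rather than a stored X column.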

Reduce Duplicate X

  1. Create a new workbook and click the Import Wizard button to open the Import Wizard. Select the data files Step01.dat, Step02.dat, and Step03.dat in the folder <Origin EXE Folder>\Samples\Curve Fitting\. Change the Import Mode to Start New Rows and make sure the default import filter step is applied. Click Finish to import these data files.
  2. Highlight columns A and B, then select Analysis:Data Manipulation:Reduce Duplicate X Data to open the reducedup dialog. Match the settings shown below:
    Tutorial Data Reduction 09.png
  3. Click OK to apply the settings. In the results sheet, Sheet2, you can see that 3 duplicates are found for each X value. In the reduced dataset, the Y values for each duplicated X have been replaced by their sum.
    Tutorial Data Reduction 10.png
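The merge described in the last step can be sketched in plain Python. This is an illustration only, not the reducedup implementation: the function name reduce_duplicate_x is hypothetical, and keeping X values in first-seen order is an assumption.

```python
def reduce_duplicate_x(x, y):
    """Merge rows that share an X value by summing their Y values.

    Hypothetical helper. Returns the unique X values (in first-seen order),
    the summed Y value for each, and how many rows were merged for each.
    """
    totals = {}
    counts = {}
    for xi, yi in zip(x, y):
        totals[xi] = totals.get(xi, 0) + yi
        counts[xi] = counts.get(xi, 0) + 1

    unique_x = list(totals)  # dicts preserve insertion order
    return unique_x, [totals[k] for k in unique_x], [counts[k] for k in unique_x]
```

With three stacked files that share the same X values, every X appears three times, so each entry in the returned counts list would be 3, matching the three duplicates reported in the results sheet.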

Reduce/Combine Duplicate Rows

  1. Go to Sheet1 from the previous section, highlight column A, and select Worksheet:Remove/Combine Duplicated Rows... to open the wdeldup dialog. Select Average from the Merge Duplications by drop-down list. Click the triangle button to the right of the Output Worksheet row and select <new>:New Sheet. Check the Output Counts box, then click OK.
    Tutorial Data Reduction 11.png
  2. Rows that share a duplicated value in the selected column are merged into a single row holding the average of the merged rows. A new column, Counts, is added to the end of the result worksheet wdeldup; it reports the number of duplicates found for each X value.
    Tutorial Data Reduction 15.png
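As a rough illustration of what the merge does, the sketch below combines whole rows that share a value in one key column, averages the remaining columns, and appends a count. It is not the wdeldup implementation; the function name combine_duplicate_rows and its key_col parameter are hypothetical.

```python
def combine_duplicate_rows(rows, key_col=0):
    """Merge rows with the same value in key_col by averaging every column.

    Hypothetical helper. Each output row is the column-wise average of the
    merged rows, with the number of merged rows appended as a final
    "Counts" value. Groups are returned in first-seen order.
    """
    groups = {}
    for row in rows:
        groups.setdefault(row[key_col], []).append(row)

    result = []
    for key, members in groups.items():
        n = len(members)
        merged = [sum(column) / n for column in zip(*members)]
        merged[key_col] = key  # keep the key value exactly as it appeared
        result.append(merged + [n])
    return result
```

Unlike a merge that only looks at one X column and one Y column, this operates on every column of the row at once, which is the distinction between the two tools drawn in this section.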

The Reduce Duplicate X tool works on a single XY dataset only, while the Remove/Combine Duplicate Rows tool works on an entire worksheet.

You can also remove duplicate XY data from an XYZ dataset by using the Statistics on Columns tool; refer to the related Quick Help for details.

Reduce by Group

  1. Open a new workbook, then click the Import Single ASCII button to import the file Magnetization.dat from the <Origin EXE folder>\Samples\Data Manipulation folder.
  2. Highlight columns A and B and click the Line button to generate a line plot.
  3. Activate this graph and select Analysis:Data Manipulation:Reduce by Group to open the reducexy dialog. Edit your settings as below:
    Tutorial Data Reduction 12.png
  4. Click OK to reduce the data. The reduced XY dataset is added as two new columns at the end of the original worksheet:
    Tutorial Data Reduction 13.png
  5. The reduced XY dataset is added as a new data plot to the original graph:
    Tutorial Data Reduction 14.png
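The general idea of group-based reduction can be sketched as follows. This is an assumption-laden illustration rather than the reducexy implementation: the settings used in the dialog above are not reproduced here, and the function name reduce_by_group, the fixed group size, and the use of the mean for both X and Y are all hypothetical choices.

```python
def reduce_by_group(x, y, group_size):
    """Replace each run of group_size consecutive points with one point.

    Hypothetical helper: each output point is the mean X and mean Y of its
    group. A final, shorter group is averaged over the points it has.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    out_x, out_y = [], []
    for start in range(0, len(x), group_size):
        group_x = x[start:start + group_size]
        group_y = y[start:start + group_size]
        out_x.append(sum(group_x) / len(group_x))
        out_y.append(sum(group_y) / len(group_y))
    return out_x, out_y
```

Averaging within each group smooths the data as it reduces it, in contrast to simply discarding rows, which keeps original values but can miss features that fall between the kept rows.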

Reduce Worksheet Rows

  1. Open a new workbook, then click the Import Single ASCII button to import the file Nitrite.dat from the <Origin EXE folder>\Samples\Spectroscopy folder. There are 6392 data points in this file.
    Tutorial Data Reduction 01.png
  2. Highlight both columns in the Nitrite worksheet and select Worksheet:Reduce Rows to open the wreducerows dialog, and edit the settings to match the image below:
    Tutorial Data Reduction 02.png
    Note: To set the output, click the arrow button to the right of the Output box and choose <new>:New Column(s).
  3. When you click OK, only the first of every 10 rows is kept, so about ninety percent of the data points are discarded. The remaining data points are output to new columns:
    Tutorial Data Reduction 03.png
  4. Select the entire worksheet and click the Line button to generate line plots of the original (black) and reduced (red) data:
    Tutorial Data Reduction 04.png
  5. Note that the data reduction changed the peak heights. To preserve the shape of the data plot, keep more data points. Click the green lock on Graph 1 and choose Change Parameters to open the wreducerows dialog again. Change the value of Delete Rows to 3 and click OK.
  6. This time 25% of the data points are kept (one row kept for every 3 deleted), which better preserves the shape of the original data plot.
    Tutorial Data Reduction 05.png
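The arithmetic behind the two runs can be checked with a short Python sketch. It is an illustration, not the wreducerows implementation: the function name reduce_rows and its delete_n parameter are hypothetical stand-ins for keeping one row and then deleting the next N.

```python
def reduce_rows(rows, delete_n):
    """Keep one row, delete the next delete_n rows, and repeat.

    Hypothetical helper: equivalent to keeping the first of every
    (delete_n + 1) rows.
    """
    if delete_n < 0:
        raise ValueError("delete_n must be non-negative")
    return rows[::delete_n + 1]


# Stand-in for the 6392 rows of Nitrite.dat (row indices only, not real data).
all_rows = list(range(6392))

kept_1_in_10 = reduce_rows(all_rows, 9)  # first of every 10 rows
kept_1_in_4 = reduce_rows(all_rows, 3)   # Delete Rows = 3

print(len(kept_1_in_10), len(kept_1_in_4))
```

Keeping the first of every 10 rows leaves 640 of 6392 rows (about 10%), while deleting 3 rows after each kept row leaves 1598 rows (exactly 25%). Because rows are dropped without regard to their values, a narrow peak whose maximum falls on a deleted row appears lower in the reduced plot, which is why keeping more rows preserves the peak heights better.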