4.1.1 Data Manipulation


Selecting and Coloring Worksheet Cells

Origin has two x-functions for conditional selection - wcellsel and wxt; The former(wcellsel) is cell-based on an arbitrary range of cells in a worksheet. The latter(wxt) is equivalent to the worksheet extract dialog that works at the column level.

The following code uses wcellsel X-function to select cells that meet a condition, and then uses wcellcolor to color them.

// create new book with some columns and fill with random data
newsheet cols:=6 xy:="Y";
// select all cells with value greater than or equal to 0.5 and color them
wcellsel 1:end c:=ge v:=0.5;
wcellcolor c:=color(green);
wcellcolor c:=color(red) type:=1;
wcellsel; // this deselects cells so colors show

Coloring Worksheet Rows by wxt

While the wcellsel function can select any worksheet cells, it can only select those cells with a simple condition for the entire range of cells. If you only need to select rows, and you need more precise conditions, then you can use the wxt X-Function, which is basically the LabTalk version of the Worksheet Query dialog. Fill two column with random numbers before running the following example:

//select rows (sel:=1) that meet condition
//col(2)>0.3 for the active worksheet
wxt "col(2)>0.3" sel:=1;

//change the color of the rows of the selected cells  
wcellcolor c:=color(green);

//reset the worksheet selection so you can see the colors correctly  

Using wxt with Text Columns

There are special considerations when the condition involves text strings. The following example shows a more complicated condition to select based on both a text column and a numeric column.

//import a sample file that has make of cars and their various characteristics
fname$=system.path.program$ + "Samples\Statistics\automobile.dat";
impasc options.sparklines:=0;
// Select all rows in worksheet with make=Acura and Gas Mileage > 11
// Use range variable to make query more readable than "col(Make)[i]$"
range car=col(Make);
// Using string variable to avoid having to write "Acura" inside " " 
string str="Acura";
//wxt will translate "=" to "==", "and" to "&&"
wxt "car=str$ and col(Gas Mileage)>11" sel:=1; // Select the rows
wcellcolor c:=color(green); // Color those rows
wcellsel; // Clear the selection

Copy and Paste Data to New Sheet and Apply Transpose

This code uses copydata X-function to copy a block of numeric data to a new worksheet and then use wtranspose X-function to transpose the data.

//Create a new worksheet with six columns of random number 
newsheet cols:=6 xy:="Y";
//Highlight a block in the worksheet
worksheet -s 3 3 4 6;  
//Copy the highlighted block into a new worksheet
copydata  orng:=<new>!<new> clear:=1;
//Transpose the resulting worksheet

copydata X-Function doesn't work for text data. So if the data block has both numeric and string data, please use range -v notation to define blocks as range variables and assign original block to new block to paste the value. Then use wtranspose to transpose the data.

//Import a sample data into a new book
fname$=system.path.program$ + "\Samples\Statistics\Protein Consumption in Europe.dat";
//Define a block of string data;
int nc = 4; // number of columns in the block
int nr = 10; // number of rows in the block
int fi = 4 + nr; // final row index of the block
range -v r1 = 1[4]:$(nc)[$(fi)]; // Set input range
//Create a new worksheet ;
wks.ncols = nc; 
int fo = 1 + nr; // final row index of result range
range -v r2=1[1]:$(nc)[$(fo)];  // Set result range
//assign values in r1 to r2
//Transpose the block
// Set the first row as Long name
wrow2label longname:=1

Splitting and Extracting Data to New Sheets

This code demonstrates the use of the discfreqs and wxt X-Functions as well as simpler data manipulation to extract data from a worksheet and place each discrete set into its own sheet. The discfreqs X-Function is an OriginPro-only feature.

//create a new book and import data file
newbook sheet:=1; 
string fname$ = system.path.program$ + "Samples\Data Manipulation\US Metropolitan Area Population.dat";
//store the values of the desired column in a range
range cc = [%(page.name$)]%(wks.name$)!col(4); 
string originalsheet=%(wks.name$);

// create two new columns for our split variables
wks.nCols = wks.nCols + 2;
wks.col5.lname$=City ;

// Read over each entry in the range...
for (ii=1; ii<=cc.getsize(); ii++)
	initialItem$=cc[ii]$; //create a string of the cell...
	// write the variables to the iith cell of our two new columns...
// Analyze the new "State" row to find the discrete "State" designations
discfreqs irng:=col(State) rd:=[<input>]Frequency!;
// Store the designations in a range
range ff = [%(page.name$)]Frequency!col(1);
// Loop over each unique State value from the discrete frequency output
for (jj=1; jj<=ff.getsize(); jj++) 
	// This string allows the conditional to point to a different
	//  row of the frequency table with each iteration
	string state$ = [%(page.name$)]2!col(1)[jj]$; 
	// The wxt X-Function requires that the user inclue the "[i]" term
	//  when using a string as part of the conditional test
	string statequote$ = ""%(state$)""; 
        string cond$ = "col(6)[i]$==" + statequote$; 
	// Wxt extracts each row of the original sheet corresponding
	//  to the currently tested state to a new sheet
	wxt test:=cond$ iw:=1 ow:="%(state$)"; 

Subgroup a Worksheet to New Worksheets

This code demonstrates the use of the discfreqs and wxt X-Functions as well as simpler data manipulation to extract data from a worksheet and place each discrete set into its own sheet. Tree and StringArray variables are used to hold the results of discrete frequency count. The discfreqs X-Function is an OriginPro-only feature.

//create a new book and import data file
newbook sheet:=1;
string fname$ = system.path.program$ + "Samples\Statistics\automobile.dat";

//store the values of the desired column in a range
range makeCol = [%(page.name$)]%(wks.name$)!col(make);
range sourceWks = [%(page.name$)]%(wks.name$)!;

//create a tree to hold the result of frequency count and do frequency
//count to "Make" column to find the discrete "Make" designations
//this X-Function is only available in OriginPro
tree tr;
discfreqs irng:=makeCol rd:=tr;

//create a string array and put the "Make" designations into it
StringArray sa;

//if sa contains data
//loop over each unique "Make" designation
//and extract data from original worksheet to new worksheets
if (sa.GetSize() != NANUM)
	for (ii=1; ii<=sa.GetSize(); ii++)
		//this string is used to generate condition to 
		//point to different "Make" designation stored
		//in string array for each iteration
		string newWk$ = sa.GetAt(ii)$;

		//construct tested condition for wxt
		string cond$ = "makeCol$==" + newWk$;

		//wxt extracts rows of the original sheet according to 
		//the currently tested state and put them to a new 
		//worksheet named by the corresponding designation
		wxt test:=cond$ iw:=sourceWks ow:=[<input>]<new %(newWk$)>;

Masking the Outliers

This script demonstrates the use of rowstats and wcellmask X-Functions to perform descriptive statistics on rows and mask the cell whose value is defined as outlier.

// create a new book and import data file
newbook sheet:=1;
string fname$ = system.path.program$ + "Samples\Graphing\Contour.dat";

// get number of columns and rows
int nCols = wks.nCols;
int nRows = wks.nRows;

// add two columns for mean and sd

// perform row stats
rowstats -r 0 irng:=col(1)[1]:col($(nCols))[nRows] mean:=col(Mean) sd:=col(SD);

// mask outlier
for(int ii=1; ii<=nRows; ii++)
	double dLower = col(Mean)[ii]-col(SD)[ii];  // mean-sd
	double dUpper = col(Mean)[ii]+col(SD)[ii];  // mean+sd
	for(int jj=1; jj<=nCols; jj++)
		double dOutlier = col($(jj))[ii];  // get cell value
		if(dOutlier<dLower || dOutlier>dUpper)
			wcellmask irng:=col($(jj))[ii];  // mask the cell


RGB Image Matrix

While a matrix contains information on how many rows and columns to display, it is internally stored as a linear array. This fact can be used to copy data from a matrix to a worksheet column as in this instance where we

  1. Import an image file
  2. Extract the Red, Green and Blue (RGB) channels (matrix to matrix)
  3. Copy the channel data (matrix to column)
  4. Construct a string representation of the RGB values
  5. Calculate a histogram of RGB values
// Point to a sample image and import
fname$ = SYSTEM.PATH.PROGRAM$ + "Samples\Image Processing and Analysis\Flower.jpg";
// Split the RGB into separate Red, Green and Blue matrices
imgRGBsplit r:=[RGBOUT]RED!1 g:=[RGBOUT]GREEN!1 b:=[RGBOUT]BLUE!1;
// Create a new Workbook
wks.ncols = 6;
wks.col1.lname$ = Red;
wks.col2.lname$ = Green;
wks.col3.lname$ = Blue;
wks.col4.lname$ = RGB;
wks.col5.lname$ = Value;
wks.col6.lname$ = Count;
// Worksheet ranges
range raDR = 1, raDG = 2, raDB = 3, raDRGB = 4;
// Matrix ranges
range raR = [RGBOUT]RED!1;
range raG = [RGBOUT]GREEN!1;
range raB = [RGBOUT]BLUE!1;
// Copy data from Matrix to Workbook by range assignment
raDR = raR;
raDG = raG;
raDB = raB;
// Create a string showing the RGB triplet
col(D) = col(1)$ + " " + col(2)$ + " " + col(3)$;
// Calculate the histogram
discfreqs irng:=raDRGB rd:=col(Value);

Convert Multiple Matrices into Worksheets

Suppose we have scanned X, Y, and Intensity data stored in a matrixbook for every height Z, and we have 20 such matrixbooks (scaen at 20 heights in Z direction) with Long Name as "Z2um", "Z4um", "Z6um"..., in which the number between "Z" and "um" means the Z height. Now we want to combine all scanned X, Y, Intensity matrices into one worksheet of XYZI structure.

This is common routine for 3D confocal microscopy images and time-lapse images.

doc -e M 
	string strMat$ = page.longname$;
	strMat$ = strMat.Between("Z","um")$; // Get Z Values from Matrixbook Long Name
	int ZHeight = %(strMat$);
	m2w method:=xyz; // Convert Matrix to XYZ
	wks.addcol(); // Add Z column
	csetvalue col:=col(4) formula:="ZHeight"; // Set Value from Numeric Extracted from Matrix Long Name

//Append converted worksheets into one
string strPWKB$ = page.name$; // Let Current Workbook as Parent Workbook
string strPWKS$ = wks.name$; // Remember Current Worksheet for Appending Other Sheets

doc -e W
	string strTempWKB$ = page.name$; // Book Short Name of Current Workbook
	if (strTempWKB$ != strPWKB$) // If not Parent Workbook, Start Appending
		wAppend irng:=([%(strPWKB$)]1!, [%(strTempWKB$)]1!) method:=row ow:=[%(strPWKB$)]1!; // Appending
		win -c %(strTempWKB$); // Delete Appended Worksheets

dataset sps = {4,1};
dataset sodr = {1,1};
wsort nestcols:=sps order:=sodr; // Sort Appened Worksheets by Z (Height)
colmove rng:=col(4) operation:=left; // Move Z column to be 3rd Column

page.longname$= "XYZI Table";
page.title=1; // Naming

Normalize a Matrix

Internally, Origin'x matrices are one dimensional vector-like arrays. Therefore mathematical operations performed are them are similar to those of vectors rather than true matrices.

// Create a matrix
newbook mat:=1;
mdim cols:=10 rows:=10;
msetvalue formula:="rnd()";

// Normalize it to [0, 1]
vnormalize ix:=mat(1) method:=range ox:=<input>;