3.7.5.6 DatasetsDatasets
IntroductionDataset
A dataset is a basic Origin object, a one-dimensional array that can contain numeric and/or text values.
- The individual values in a dataset are called elements.
- Each element in a dataset is accessed by index number starting from 1, which is different from C convention where the index starts from 0. In newer versions, the index of last element is 0.
- Each dataset in an Origin Project has a unique name.
- A dataset may contain different data types
A dataset is often associated with a worksheet column and/or one or more data plots. When a dataset is deleted, associated worksheet columns and data plots are also deleted.
A dataset is also a LabTalk Variable, so it can either belong to the project, or be declared as a local variable. The following types of datasets are supported in Origin:
Column or Matrix Object: A dataset that is associated with a column in a worksheet or a Matrix Object in a Matrixsheet.
Loose Dataset: A Dataset which is not attached to a worksheet or matrixsheet.
Temporary Dataset: A type of loose dataset used as temporary dataset storage. Origin creates temporary datasets with names starting with an underscore ("_") character, e.g., _temp. Origin temporary datasets are deleted when the project is saved or a graph page is printied/copied. User-created temporary (loose) datasets are deleted when a project is closed or a new project is opened.
To see the names of the existing datasets, use List s command.
Creating a DatasetCreate Dataset
Columns in worksheet and matrix object in matrix sheet are automatically dataset objects, and can be accessed by col( ) or mat( ). E.g.
- col(1) refers to 1st column
- col(A) refers to the column with short name A
- col("Amplitude") refers to the column with long name Amplitude.
- mat(1) refers to 1st matrix object in active matrix sheet.
When you assign/output value to a non-existing column or matrix, the corresponding column or a matrix object will be created automatically added. E.g.
wks.ncols = 3; // Set active sheet to be only 3 columns
col(4)={1,2,3}; // 4th column will be automatically added in worksheet
There are three ways to create a loose dataset:
- Declare it with [LabTalk:Dataset_(command)|dataset command]]
- Use the Create (Command).
- Auto create with assignment.
Declaring a Dataset and the Dataset -a option
If you declare a dataset without using the -a option, it will be created as Numeric(Double) only. To create a Text & Numeric dataset, add the option -a.
Examples
// Declare and initialize a numeric dataset
Dataset ds1 = {1,2,3};
ds1[1]= ; //get 1st element value
type $(ds1); //display all numeric elements
// Declare a Text & Numeric dataset
dataset -a vv;
vv=data(1,10);//assign 1,2, ... 10 to vv. There are 10 elements
vv[2]$="Beta"; //set 2nd element to be a string(text)
vv[2]$= ; //get 2nd element string
type $(vv); //display all numeric element values. string will output as --
vv[0]$="Bye"; //same as vv[10]$="Bye"
vv[0]$= ; //get last element string . same as vv[10]$
// loose dataset mydata1 is created as result of assignment
mydata1 = col(1); // assign values in 1st column to mydata1
type $(mydata1); //display all numeric elements
Deleting a DatasetDelete Dataset
Use Labtalk command delete to delete dataset object. It works for loose and temporary dataset and dataset created by range, col(), mat() etc.
Note: Origin's LabTalk command can be shorted to as long as it's unique. E.g. delete can be shortened to del
//delete column
del col(B);
//delete range
range r3=3; //3rd column in active sheet
del r3; //delete the dataset referred by r3
del -ra r3; //delete the range variable
//delete dataset
dataset ds1={1,2,3};
list s; //list all datasets. Should see ds1 under Session: part
del ds1; // delete the dataset
del -al ds1 //delete the variable
list s;
If a dataset is created without declaration, it's a project variable. It will be saved with the project file.
mydata1 = data(1,30);
If a dataset is created with declaration, it is a session variable. It will be gone when current Origin session is closed.
dataset mydata2 = data(1,100);
In Script window, choose Tools: Variables... or run "edit" command to open a dialog to see all variables and their scope (type). There is Delete button there to delete variables.
Getting and Setting SizeGet dataset sizeSet dataset size
Before Origin 8, you had to use Set and Get command with a dataset name to set and get a dataset's size. In Origin 8 you can use the following two methods:
- GetSize()
- SetSize(int nSize)
which are part of the vectorbase OriginC class and inherited by a LabTalk dataset. In order to use these methods, you must have a variable of the type Range or type Dataset. For example:
Dataset ds1 = {1,2,3};
for( int ii = 1 ; ii <= ds1.GetSize() ; ii++ )
{
ds1[ii]= ds1[ii]^2;
};
type $(ds1);
You cannot use these methods on a dataset created by assignment without declaration:
temp = col(a); //temp is created by assignment
temp.GetSize()=; // should produce --
Instead, declare a Dataset object first:
Dataset temp = col(a); //temp is declared and initialized
// If the dataset is too small
if( temp.GetSize() < 50 )
// make it bigger
temp.SetSize(366);
temp.GetSize()=;
Accessing DatasetsAccessing Dataset
Datasets in Worksheets and in Matrices can be accessed by the following means:
- Range Variables
- Col() or Mat() function and WCol() function
- Cell() function
- Substitution Notation
- Dataset name
Range Variables
Range variables can be used to access data in worksheets, graphs, and matrices.
For Worksheet Data
string strBk$ = Book1;
string strSh$ = Sheet1;
range rDa = [strBk$]strSh$!col(A); // This is a data range
range rSh = [strBk$]strSh$!; // This is a sheet range
rDa = {1:30}; // data generation from 1 to 30 increment 1
// Use properties of a sheet range/object
if(rSh.ncols < 10)
// adding more columns can be simply done like this
rSh.nCols = 10;
loop(ii,2,10)
{
// re-assign range programmatically
range rDb = [strBk$]strSh$!wcol(ii);
rDb = normal(30);
rDb += rDa;
}
For Plotted (Graph) Data
You can get various ranges from plotted data as well.
// Get range ( [Book]Sheet!col ) of third dataplot in active layer
range -w rngData = 3;
rngData = rngData+10; //add 10 to each element
For Matrix Data
Similarly, you can access Matrix object with a Range variable and matrix elements by [rowNumber, colNumber].
newbook mat:=1; //create a matrix workbook
wks.ncols=5; //set number of columns to 5
wks.nrows=4; //set number of rows to 4
range mymat=1; //range variable mymat refers to 1st matrix object
mymat=data(1,20); //assign mymat with numbers from 1 to 20
Though a matrix looks like a 2 dimensional array (MxN table), it's actually a single dimensional array using row major ordering. and you can access a matrix value by:
- MatrixObject[N*(i - 1) + j]
where N is the number of columns in the matrix, and i and j is the row and column number. So for a 4 rows by 5 columns matrix, using [2, 3] and [8] returns the same cell. E.g. after above code, run
if(mymat[2, 3] == mymat[8])
type "They are the same!";
By Col(), Mat(), and WCol() function
Use Col( ) and WCol( ) to access columns in workbook
// Make sure there is a workbook named Book5 with Sheet2.
// Here, the win -o command is used to temporarily switch books and set
string strBk$ = Book5;
string strSh$ = Sheet2;
win -o strBk$
{
temp = page.active; // remember the active sheet
page.active$ = strSh$;
// col( ) function can take a column number (1..n)
col(1) = {1:30};
loop(ii,2,10)
{
// wcol( ) takes column number or an integer variable
wcol(ii) = normal(30);
// col( ) function can also take a Short Name
wcol(ii) += col(A);
}
page.active = temp; // Restore the originally active sheet
}
// Exit 'win -o', returning to previously active window
Similar to regular workbook, there can be multiple matrix sheets in matrix workbook. And on each matrix sheet, there can be multiple matrix objects which is an analog to columns in regular sheet.
Therefore Mat( ) can be used as analog to Col( ).
newbook mat:=1; //create a new workbook. default is 32x32 matrix
mat(1)=normal(32*32); //fill 1st matrix object with normalized random values
wks.nmats=2; //set number of matrix objects to 2.
matrix -it 1; //show image thumbnails to see all matrix objects in a matrix sheet
wks.active=2; //activate 2nd matrix object
mat(2)=data(0.01, 10.24, 0.01); //fill 2nd matrix object with values from 0.01 to 10.24, increment by 0.01
Cell FunctionCell Function
The Cell( ) function can be used with either a worksheet or a matrix. When referring to a non-active book, it can only refers to active sheet in the specified workbook with prefix bookshortname!.
cell(5,2) = 50; // Sets row 5, column 2 cell in active window
//(worksheet or matrix) to 50
// You can use prefix for different workbook/matrix window,
// but target is still the active sheet in that page
// Sets row 10, column 1 in active sheet of Book5 to 100
Book5!cell(10,1) = 100;
// Function accepts variables
for(ii = 1 ; ii <= 32 ; ii++)
{
for(jj = 1 ; jj <= 32 ; jj++)
{
MBook1!cell(ii,jj) = (ii-1) * 32 + jj;
}
}
%() Substitution Notation
Use Worksheet Column and Cell Substitution to access dataset in worksheet.
//access column in workbook
%(%H, A)=data(1,30); //active window, active sheet, column A
%([%H]Sheet2, A)=data(50,70, 0.5); //active window, sheet2, column A
//access cell in workbook
%(%H,2,1) = 1; // active book, active sheet, 2nd column, 1st row
%([Book2]Sheet3,2,1) = 100; //book2, sheet3, 2nd column, 1st row
%([Book2],2,1) = 10; // This may produce a command error in older versions
%(Book2,2,1) = 10; // Book2, active sheet, 2nd column, 1st row
// Substitution works before execution, so you can do things like:
%([Book2]Sheet1,2,1) + %([Book2]Sheet2,2,1) + %([Book2]Sheet3,2,1)=;
val = %([Book2]Sheet1,2,1) + %([Book2]Sheet2,2,2) + %([Book2]Sheet3,2,3);
type "Cross-sheet sum \x3D $(val)";
By Dataset Name
This is the oldest notation in LabTalk and should not be used unless you have to maintain old LabTalk code, or when you know the workbook has only one sheet or you are working with loose datasets.
// Dataset Name
// Using the formal name of a dataset :
// PageName_ColumnName[@SheetIndex]
// Where PageName and ColumnName are the Short Name
// and SheetIndex is a creation order index
// (SheetIndex is not the order of the sheet tabs in the window)
Book1_B = Uniform(100); // Can fill datasets
Book1_B@2 = Data(1,365); // Same book and same column name, but on the
// second sheet
July_Week1@3[5]$ = Sell; // Can access a particular row element
// (5th row of column 'Week1' on sheetindex
// 3 in book 'July')
BSmith = 6;
ChemLab_Grade@3[BSmith] = 86;
// Can use a variable to access row
//(6th row of 'Grade' in third sheet of 'ChemLab')
Accessing Dataset Storage
Beginning with Origin 2018b, there is LabTalk access to dataset Storage. Utility is limited at this time but will be expanded upon in later versions.
Prefix a dataset name with
dataset.
For example:
dataset.Book1_B.info.=; // returns information about content in the info. area of the dataset
dataset.Book1_B.info.Add(Something); // adds storage object "Something" to the dataset
dataset.Book1_B.info.Remove(Something); // deletes storage object "Something" from the dataset
Masking Cells
Mask CellCells in worksheet columns and matrixsheets can be masked by setting
- dataset<index> = 1
For example:
// Masking the cell 3 in column A
col(a)<3> = 1;
// Mask a cell in matrix
range rr = [mbook1]msheet1!mat(1);
rr<2> = 1;
// Unmask a cell
col(a)<3> = 0;
For matrix, you can also use <row, col> to specify a cell:
// Mask the 2nd row, 3rd column of current matrixsheet
%C<2,3> = 1;
// Mask the whole 2nd row of the current matrixsheet
%C<2,0> = 1;
// Mask the whole 3rd column of the current matrixsheet
%C<0,3> = 1;
To mask a specified range, please see the mark command.
Set Column Values with a Dataset
Minimum Origin Version Required: 9.0 SR0
You can generate a dataset of certain size with a formula and some before formula script, and later assign this dataset to a certain column. The syntax is:
dataset.setvalues("Formula", "Before formula script", "From", "To")
In the Formula you could call functions like in the F(x) menu of set column values dialog.
An example is shown as following:
newbook;
wks.addcol();
dataset ds1,ds2,ds3;//Define three dataset objects
//Set values for each object.
ds1.setvalues("i+grnd()","","1","20");
ds2.setvalues("sin(i)","","1","20");
ds3.setvalues("col(1)*col(2)","col(1)=ds1;col(2)=ds2;","1","20");
//Use ds3 dataset as the value of column 3
col(3)=ds3;
Note that the setvalues method of a dataset does not support recalculate options. For that, see the csetvalue X-Function.
Using Loose Datasets in X-FunctionsLoose Dataset, X-FunctionX-Function, Loose Dataset
Loose datasets are more efficient to use in a LabTalk script since they do not have the overhead of a column in a worksheet. If used inside a scope, they are also cleaned up automatically when the scope ends. For example, declare a loose dataset missingvals, and then use the vfind X-Function to assign row-numbers of missing values in column 1 (of the active worksheet) to it:
// Declare a loose dataset
dataset missingvals;
// Call vFind X-Function to find row numbers
// where missing values appear in col 1
vfind ix:=1 value:=NANUM ox:=missingvals;
// Access missingvals for results
for(int ii=1; ii<=missingvals.GetSize(); ii++)
type $(missingvals[ii]);
Many X-Functions produce outputs in the form of a tree and some of the tree nodes can be vectors. To access such a vector, you need to copy it to a dataset. You may directly put the tree vector into a column, but if the vector is used in subsequent calculations, it is more efficent to use a loose dataset.
|