Monday, April 16, 2012

Vilno Table, Real Examples of How It Works

These are real examples produced by the software. It is not yet version 1.0, but it is very functional and, as you can see, the technology works.

Because the tables produced by the software are not so easy to put into a blog post, I am using a link to a document in "Google Docs":


So click on it, and you will see the examples from the appendix of the white paper.
Each example is 2 pages: the code (usually less than 15 lines), and the table that it produces.
I am having difficulty choosing landscape or portrait for each page, so every page, unfortunately, is landscape. When you upload a Word document from the hard drive to Google Docs, it throws some things out, such as page-by-page choices of landscape or portrait.

Sunday, April 1, 2012

More Examples of Vilno Table

Here I show a few examples of rather simple statistical tables. Vilno Table has an enormous worker-productivity advantage over SAS (and Excel) for complex, customized, picky statistical table requests, and the more complex and picky the request, the greater the advantage. For very simple requests the difference in productivity is smaller, but Vilno Table is still easy to use for simple tables.

Most of these examples use only summary statistics (the last example will be a one-way ANOVA). Each example has only one available dataset and essentially one analysis (Vilno Table can produce a table using multiple data sources and different analyses, but these examples are simple).

This is the code that describes the available datasets. Here, there is only one dataset, the PATINFO dataset:

inputdset asc a/PATINFO site patid trtgrp gender race happy weight age ;

(trtgrp means "Treatment Group"; happy is a categorical outcome variable.)


This is just a two-way frequency table:

denom trtgrp ;

col trtgrp*( N % ) ;

row happy ;
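
For readers who want to see what this table computes, here is a rough Python sketch. It is not Vilno Table code and not how the product works internally. The records are made up, and the function name two_way_table is hypothetical. The one assumption worth flagging is that "denom trtgrp" means each percentage uses the treatment-group total as its denominator.

```python
from collections import Counter

# Hypothetical (trtgrp, happy) records, invented purely for illustration.
records = [
    ("Placebo", "Yes"), ("Placebo", "No"), ("Placebo", "No"), ("Placebo", "No"),
    ("60mg", "Yes"), ("60mg", "Yes"), ("60mg", "Yes"), ("60mg", "No"),
]

def two_way_table(records):
    """Return {(happy, trtgrp): (n, pct)}, with pct relative to the trtgrp total."""
    cell_counts = Counter((happy, trt) for trt, happy in records)
    # Assumed meaning of "denom trtgrp": one denominator per treatment group.
    denominators = Counter(trt for trt, _ in records)
    row_levels = sorted({happy for _, happy in records})
    col_levels = sorted(denominators)
    table = {}
    for happy in row_levels:
        for trt in col_levels:
            n = cell_counts[(happy, trt)]
            table[(happy, trt)] = (n, 100.0 * n / denominators[trt])
    return table

table = two_way_table(records)
for (happy, trt), (n, pct) in table.items():
    print(f"happy={happy:<3} trtgrp={trt:<7} N={n} %={pct:.1f}")
```

Each row of the printed output corresponds to one cell of the table, holding the N and % pair that the column statement asks for.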


This is the same thing, but add a chi-square p-value in the upper right corner:

( I add a model statement, and add the word "pvalue" to the column statement )

denom trtgrp ;

model chisq(trtgrp*happy) ;

col trtgrp*( N % ) pvalue ;

row happy ;
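
As a rough illustration of the number that lands in the p-value cell, here is a Python sketch of the Pearson chi-square test of independence. It is not Vilno Table code. The counts are made up, the function names are hypothetical, and I am assuming no continuity correction. To stay within the standard library, the p-value helper only covers 1 or 2 degrees of freedom, where a closed form exists.

```python
import math

def pearson_chi_square(observed):
    """observed: list of rows of counts. Returns (statistic, degrees_of_freedom)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return statistic, df

def chi_square_p_value(statistic, df):
    """Upper-tail chi-square probability, for df = 1 or df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        return math.exp(-statistic / 2.0)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")

# Hypothetical counts: rows are happy = No / Yes, columns are Placebo / 60mg.
observed = [[30, 20], [20, 30]]
statistic, df = pearson_chi_square(observed)
p_value = chi_square_p_value(statistic, df)
print(f"chi-square={statistic:.2f} df={df} p={p_value:.4f}")
```

For larger tables a statistics library would supply the general chi-square tail probability.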


These are assorted summary statistics for certain demographic subgroups:

col all gender*race [age<65] ;

row N mean(weight) std(weight) median(weight) ;
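
Here is a rough Python sketch of the numbers such a table would hold. It is not Vilno Table code. The patient records and function names are invented, and two points are my assumptions rather than anything stated here: that the [age<65] filter applies to every column, and that std is the sample standard deviation (n - 1 denominator).

```python
import statistics

# Hypothetical patients, invented purely for illustration.
patients = [
    {"gender": "F", "race": "A", "age": 40, "weight": 60.0},
    {"gender": "F", "race": "A", "age": 50, "weight": 70.0},
    {"gender": "M", "race": "B", "age": 30, "weight": 80.0},
    {"gender": "M", "race": "B", "age": 60, "weight": 90.0},
    {"gender": "M", "race": "B", "age": 70, "weight": 100.0},  # dropped by age < 65
]

def summarize(weights):
    """N, mean, sample std and median of a non-empty list of weights."""
    std = statistics.stdev(weights) if len(weights) > 1 else float("nan")
    return {
        "N": len(weights),
        "mean": statistics.mean(weights),
        "std": std,
        "median": statistics.median(weights),
    }

def summary_columns(patients):
    """One 'all' column plus one column per (gender, race) combination."""
    kept = [p for p in patients if p["age"] < 65]  # assumed scope of the filter
    columns = {"all": [p["weight"] for p in kept]}
    for p in kept:
        columns.setdefault((p["gender"], p["race"]), []).append(p["weight"])
    return {name: summarize(weights) for name, weights in columns.items()}

summary = summary_columns(patients)
for name, stats in summary.items():
    print(name, stats)
```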


Okey-dokey, let's try one slightly more advanced example, a one-way ANOVA (well, technically, since "age" is a continuous covariate, it is an analysis of covariance). Here lm stands for "linear model".

model lm( weight ~ trtgrp -1 + age ) ;

col trtgrp*est all*( est_pw("60mg"-"Placebo") pval_pw("60mg"-"Placebo") ) ;

row all ;

What you get is a least-squares mean for every treatment group, and just one pairwise comparison.
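
To make the least-squares mean idea concrete, here is a rough Python sketch. It is not Vilno Table code and not a description of how the product computes anything. It fits weight on one indicator per treatment group (no intercept) plus age, then evaluates each group at the overall mean age, which is the usual definition of a least-squares mean and my assumption about what est reports. The data and function names are invented, and the p-value is left out because it needs a t distribution.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_ls_means(rows, levels):
    """rows: (trtgrp, age, weight) tuples. Returns (ls_means, age_slope)."""
    # Design matrix: one 0/1 column per treatment level, then age.
    design = [[1.0 if trt == lvl else 0.0 for lvl in levels] + [float(age)]
              for trt, age, _ in rows]
    y = [weight for _, _, weight in rows]
    k = len(levels) + 1
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    mean_age = sum(age for _, age, _ in rows) / len(rows)
    ls_means = {lvl: beta[i] + beta[-1] * mean_age for i, lvl in enumerate(levels)}
    return ls_means, beta[-1]

# Hypothetical data built so that weight = group offset + 0.5 * age exactly.
rows = [
    ("Placebo", 40, 70.0), ("Placebo", 60, 80.0), ("Placebo", 50, 75.0),
    ("60mg", 30, 60.0), ("60mg", 50, 70.0), ("60mg", 70, 80.0),
]
ls_means, age_slope = fit_ls_means(rows, ["Placebo", "60mg"])
difference = ls_means["60mg"] - ls_means["Placebo"]
print(ls_means, difference)
```

The difference between the two least-squares means is the quantity that the pairwise "60mg" minus "Placebo" column would display.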

Let's make the row and column headers easier to read with:

label trtgrp "Treatment Group" est "Least Square Mean"

all "60mg vs Placebo" ""

est_pw "Difference" pval_pw "P-value" ;

To the dataset description code at the top, for the linear model, I'll need to add:

categorical trtgrp ;

continuous weight age ;

(This extra description code was not needed for the summary statistics tables.)


The above 4 examples are a lot simpler than most of the examples in the appendix of the Vilno Table Programming Language. The above examples do not show the full flexibility of Vilno. They show it's easy to produce tables that should be easy.

Just one more example before I go. The next table is several frequency cross-tabulations stacked vertically, each one with a chi-square p-value on the right side (each row section has a different row category, but the column category is always treatment group).


Several categorical tests, with N and %, and the p-value in the right column, like a baseline characteristic table. Again, most of the important stuff is in the model statement, the column statement, and the row statement.



model chisq(thisrowcat*trtgroup*N) ;


col (trtgroup all)*(N %) pvalue ;


row gender race age_group ;




This is fairly similar to table A1 in the Vilno Table white paper. The current beta version cannot yet put continuous and categorical statistics into the same column; a later version will be able to, once cr-modifier statements have been added to the parser.
(A baseline characteristic table crams a lot of stuff into the same page, so N and % must share the same column with mean and std. deviation).