
VassarStats: Statistical Computation Web Site

Sample Size Calculator - Confidence Level, Confidence Interval, Sample Size, Population Size, Relevant Population - Creative Research Systems. This Sample Size Calculator is presented as a public service of Creative Research Systems survey software. You can use it to determine how many people you need to interview in order to get results that reflect the target population as precisely as needed. You can also find the level of precision you have in an existing sample. Before using the sample size calculator, there are two terms you need to know: confidence interval and confidence level. If you are not familiar with these terms, click here. Enter your choices into the calculator below to find the sample size you need or the confidence interval you have. Sample Size Calculator Terms: Confidence Interval & Confidence Level. The confidence interval (also called margin of error) is the plus-or-minus figure usually reported in newspaper or television opinion poll results. The confidence level tells you how sure you can be. Factors that affect confidence intervals: sample size, percentage, and population size.
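A quick way to see how these terms interact is to compute the sample size directly. The snippet below is a minimal sketch in R (not Creative Research Systems' own code) of the standard formula with a finite population correction; the function name and defaults are illustrative.
sample_size <- function(conf_level = 0.95, margin = 0.05, population = Inf, p = 0.5) {
  z  <- qnorm(1 - (1 - conf_level) / 2)       # z-score for the chosen confidence level
  n0 <- z^2 * p * (1 - p) / margin^2          # sample size for an effectively infinite population
  if (is.finite(population)) {
    n0 <- n0 / (1 + (n0 - 1) / population)    # finite population correction
  }
  ceiling(n0)
}
sample_size(conf_level = 0.95, margin = 0.05, population = 20000)   # returns 377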

Interactive Statistical Calculation Pages

Sample Size Calculator by Raosoft, Inc. If 50% of all the people in a population of 20000 people drink coffee in the morning, and if you were to repeat the survey of 377 people ("Did you drink coffee this morning?") many times, then 95% of the time your survey would find that between 45% and 55% of the people in your sample answered "Yes". The remaining 5% of the time, or for 1 in 20 survey questions, you would expect the survey response to be more than the margin of error away from the true answer. When you survey a sample of the population, you don't know that you've found the correct answer, but you do know that there's a 95% chance that you're within the margin of error of the correct answer. Try changing your sample size and watch what happens to the alternate scenarios. That tells you what happens if you don't use the recommended sample size, and how the margin of error and the confidence level (that 95%) are related. To learn more if you're a beginner, read Basic Statistics: A Modern Approach and The Cartoon Guide to Statistics.
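To check the Raosoft example numerically, here is a hedged sketch in R (not Raosoft's code): with a sample of 377 drawn from a population of 20000 and a true proportion of 50%, the 95% margin of error comes out at roughly 5%, matching the 45%-55% range quoted above.
n <- 377; N <- 20000; p <- 0.5
z   <- qnorm(0.975)                                     # 95% confidence level
se  <- sqrt(p * (1 - p) / n) * sqrt((N - n) / (N - 1))  # standard error with finite population correction
moe <- z * se
round(moe, 3)   # about 0.05, i.e. plus or minus 5 percentage points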

Downloadable Sample SPSS Data Files. Data quality checks: ensure that required fields contain data; ensure that the required homicide (09A, 09B, 09C) offense segment data fields are complete; ensure that the required homicide (09A, 09B, 09C) victim segment data fields are complete; ensure that offenses coded as occurring at midnight are correct; ensure that victim variables are reported where required and are correct when reported but not required. Standardizing the Display of IBR Data: An Examination of NIBRS Elements: Time of Juvenile Firearm Violence; Time of Day of Personal Robberies by Type of Location; Incidents on School Property by Hour; Temporal Distribution of Sexual Assault Within Victim Age Categories; Location of Juvenile and Adult Property Crime Victimizations; Robberies by Location; Frequency Distribution for Victim-Offender Relationship by Offender and Older Age Groups and Location. Analysis examples: FBI's Analysis of Robbery; FBI's Analysis of Motor Vehicle Theft Using Survival Model.
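As an illustration of the first data-quality check listed above (required fields contain data), here is a hedged R sketch; the file name and field names are hypothetical, not taken from the sample files themselves.
library(haven)                                        # reads SPSS .sav files
seg <- read_sav("sample_offense_segment.sav")         # hypothetical sample SPSS file
required <- c("incident_number", "offense_code", "incident_date")   # hypothetical required fields
colSums(is.na(seg[, required]))                       # count of missing values per required field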

R: The R Project for Statistical Computing

Data & Documentation | YRBSS | Adolescent and School Health | CDC. Youth Risk Behavior Survey (YRBS) data are available in two file formats: Access® and ASCII. New Sexual Minority Data are Now Available. Combined YRBS Datasets and Documentation: the combined YRBS dataset includes national, state, and large urban school district data from selected surveys from 1991-2015, distributed as National data (zip), States A-M data (zip), and States N-Z data (zip). National YRBS Datasets and Documentation: data files and SPSS syntax (.sps).
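Since the ASCII files are fixed-width, one way to load them in R is readr::read_fwf; the sketch below is hypothetical (the file name and column positions are placeholders), as the real layout must be taken from the accompanying SPSS syntax (.sps) file.
library(readr)
yrbs <- read_fwf(
  "yrbs_national.dat",                            # hypothetical file name
  fwf_positions(start = c(1, 3, 4),               # hypothetical column positions;
                end   = c(2, 3, 4),               # use the layout from the .sps syntax file
                col_names = c("age", "sex", "grade"))
)
head(yrbs)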

The R Trader » Blog Archive » BERT: a newcomer in the R Excel connection. A few months ago a reader pointed me to this new way of connecting R and Excel. I don't know how long this has been around, but I never came across it and I've never seen any blog post or article about it. So I decided to write a post, as the tool is really worth it, and before anyone asks, I'm not related to the company in any way. BERT stands for Basic Excel R Toolkit. It's free (licensed under the GPL v2) and it has been developed by Structured Data LLC. At the time of writing the current version of BERT is 1.07. In this post I'm not going to show you how R and Excel interact via BERT. How do I use BERT? My trading signals are generated using a long list of R files, but I need the flexibility of Excel to display results quickly and efficiently. I use XML to build user-defined menus and buttons in an Excel file; those menus and buttons are essentially calls to VBA functions; and those VBA functions are wrappers around R functions defined using BERT (sketched below). The original post then walks through the prerequisites and a step-by-step setup guide.
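For a sense of what sits at the bottom of that chain, here is a hedged sketch of the kind of plain R function BERT can expose to Excel; the function name and logic are illustrative, not taken from the blog post.
sma_signal <- function(prices, short = 10, long = 50) {
  # simple moving-average crossover: 1 = long, -1 = flat
  short_ma <- mean(tail(prices, short))
  long_ma  <- mean(tail(prices, long))
  if (short_ma > long_ma) 1 else -1
}
# In the workflow described above, an Excel button calls a VBA macro, which in turn
# calls a function like this one through BERT and writes the result back to the sheet.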

Descriptive Statistics - Free Statistics and Forecasting Software (Calculators) v.1.2.1. To cite Wessa.net in publications use: Wessa, P. (2019), Free Statistics Software, Office for Research Development and Education, version 1.2.1. Academic license for non-commercial use only; the software is provided "AS IS" without warranty of any kind. Software version: 1.2.1; algorithms & software: Patrick Wessa, PhD; server: www.wessa.net.

Measuring Association in Case-Control Studies. All the examples above were for cohort studies or clinical trials in which we compared either cumulative incidence or incidence rates among two or more exposure groups. However, in a true case-control study we don't measure and compare incidence; there is no "follow-up" period in case-control studies. In the module on Overview of Analytic Studies we considered a rare disease in a source population with 7 exposed and 6 unexposed diseased subjects among 1,007 exposed and 5,640 unexposed people in total. This view of the population is hypothetical because it shows us the exposure status of all subjects in the population. Another way of looking at this association is to consider that the "Diseased" column tells us the relative exposure status of people who developed the outcome (7/6 = 1.16667), and the "Total" column tells us the relative exposure status of the entire source population (1,007/5,640 = 0.1785). The Odds Ratio: the relative exposure distributions among cases (7/6) and among a sample of non-diseased controls (10/56) are really odds, i.e. the odds of exposure among cases and among non-diseased controls.
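Putting those numbers together, the odds ratio is the odds of exposure among cases divided by the odds of exposure among controls; a short R check (a sketch using the figures quoted above) gives:
odds_cases    <- 7 / 6      # exposed vs. unexposed cases
odds_controls <- 10 / 56    # exposed vs. unexposed sampled controls
odds_ratio    <- odds_cases / odds_controls
odds_ratio                  # (7/6) / (10/56) = about 6.53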

Introduction to Principal Component Analysis (PCA) - Laura Diane Hamilton. Principal Component Analysis (PCA) is a dimensionality-reduction technique that is often used to transform a high-dimensional dataset into a smaller-dimensional subspace prior to running a machine learning algorithm on the data. When should you use PCA? It is often helpful to use a dimensionality-reduction technique such as PCA prior to performing machine learning because reducing the dimensionality of the dataset reduces the size of the space on which k-nearest-neighbors (kNN) must calculate distance, which improves the performance of kNN. What does PCA do? Principal Component Analysis does just what it advertises: it finds the principal components of the dataset. Can you ELI5? Let's say your original dataset has two variables, x1 and x2. We want to identify the first principal component, the direction that explains the highest amount of variance. If we project the data onto that first principal component only, you can think of the projection sort of like a shadow of the original data.
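A minimal PCA sketch in R using the built-in prcomp(); the two-variable data set is simulated here purely to mirror the x1/x2 example, so the numbers are illustrative.
set.seed(1)
x1  <- rnorm(100)
x2  <- 0.8 * x1 + rnorm(100, sd = 0.3)   # x2 correlated with x1, as in the example
dat <- data.frame(x1, x2)
pca <- prcomp(dat, center = TRUE, scale. = TRUE)
summary(pca)                 # proportion of variance explained by each component
scores_pc1 <- pca$x[, 1]     # projection of the data onto the first principal component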

Free Statistical Software. The R Project for Statistical Computing: full featured, very powerful, and runs on Unix operating systems among others. Analysis Lab: basic analyses, good for teaching. A nice collection of small programs for specific types of analyses. DataPlot: includes scientific visualization, statistical analysis, and non-linear modeling. MacAnova: not just for Macs, and not just ANOVA. BrightStat: basic analyses including many non-parametric tests.

THE DECISION TREE FOR STATISTICS The material used in this guide is based upon "A Guide for Selecting Statistical Techniques for Analyzing Social Science Data," Second Edition, produced at the Institute for Social Research, The University of Michigan, under the authorship of Frank M. Andrews, Laura Klem, Terrence N. Davidson, Patrick O'Malley, and Willard L. Rodgers, copyright 1981 by The University of Michigan, All Rights Reserved. The Decision Tree helps you select statistics or statistical techniques appropriate for the purpose and conditions of a particular analysis, and then select the MicrOsiris commands which produce them or find the corresponding SPSS and SAS commands. Start with the first question on the next screen and choose one of the alternatives presented there by selecting the appropriate link. The "Statistics Programs" button provides a table of all statistics mentioned which can be produced by MicrOsiris, SPSS, or SAS and the corresponding commands for them. The site also includes a glossary and references.

Do Faster Data Manipulation using These 7 R Packages. Introduction: data manipulation is an inevitable phase of predictive modeling. A robust predictive model can't be built using machine learning algorithms alone; it also requires understanding the business problem and the underlying data, performing the required data manipulations, and then extracting business insights. Among these phases of model building, most of the time is usually spent understanding the underlying data and performing the required manipulations. That is also the focus of this article: packages that perform faster data manipulation in R. What is data manipulation? If you are still confused by this term, let me explain: the data collection process can have many loopholes, so raw data usually needs to be cleaned and reshaped before analysis. At times this stage is also known as data wrangling or data cleaning. Different ways to manipulate / treat data: there is no right or wrong way to manipulate data, as long as you understand the data and have taken the necessary actions by the end of the exercise (see the sketch below for two of the packages the article covers).
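As a taste of the packages the article covers, here is a hedged sketch showing dplyr and data.table doing the same filter-group-summarise step on a made-up data frame; the data and column names are illustrative.
library(dplyr)
library(data.table)
df <- data.frame(store = rep(c("A", "B"), each = 50),
                 sales = runif(100, min = 0, max = 100))
# dplyr: readable, pipe-based syntax
df %>%
  filter(sales > 20) %>%
  group_by(store) %>%
  summarise(mean_sales = mean(sales))
# data.table: concise syntax, fast for large data
dt <- as.data.table(df)
dt[sales > 20, .(mean_sales = mean(sales)), by = store]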
