Stata Introduction, 3 h

Academic year: 2022

(1)

Stata Introduction, 3 h

Presented by Cecilie Dahl

Presentation, data and programs at:

https://www.med.uio.no/helsam/forskning/aktuelt/arrangementer/andre/stata-course-uio.html

(2)

Stata introduction

• General use

•Interface and menu

•Do-files and syntax

•Data handling

• Analysis

•Descriptive

•Graphs

•Bivariate

Exercises

(3)

Why Stata

• Pro

•Price

•Aimed at epidemiology (and economics)

•Many methods, growing

•Graphics

•Structured, Programmable

• Con

•File size < Memory

(4)

Smart working

• Data (.dta)

•Master file, safe

•Working file for each project

• Syntax (.do)

•Work in progress file

•Manuscript file (Table 1…, Figure 1…, Supplement)

• Output (.smcl or .log)

•Save or discard

(5)

INTERFACE

(6)

Interface Stata 12 (and 16)

Do file

Data edit

(7)

Menu

H.S. 7

(8)

Do-file example

New do-file: icon or Ctrl-9

Run: select the lines, Ctrl-D

(9)

Syntax

• Examples

• mean age

• mean age if sex==1

• bysort sex: summarize age

• summarize age, detail

• Syntax

[bysort varlist:] command [varlist] [if exp] [in range] [, opts]
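A minimal sketch of the syntax pattern above, one line per element. It assumes Stata's bundled auto data (loaded with sysuse), which is not the course data; the variables mpg, foreign and make belong to that example dataset.

```stata
* Assumption: the bundled auto dataset stands in for the course data
sysuse auto, clear

* command varlist
summarize mpg

* command varlist if exp
summarize mpg if foreign == 1

* command varlist in range
list make mpg in 1/5

* command varlist, options
summarize mpg, detail

* bysort varlist: command varlist
bysort foreign: summarize mpg
```

The bysort prefix sorts the data by foreign and then repeats the command once per group.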

(10)

DATA HANDLING

(11)

Export data from SPSS

•In SPSS 14.0 or later:

Save as, Stata Version 8 SE

(12)

Use and save data

• Open data

• use "C:\Course\Myfile.dta", clear

• Describe

• describe // describe all variables

• list sex age in 1/20 // list observations 1 to 20

• Save data

• save "C:\Course\Myfile.dta", replace

(13)

Exercise 1

•Start Stata

•Open the birth data (…birth1.sav)

•Open a new syntax file (Ctrl-9)

•Describe all variables: describe.

•List the 10 first observations of weight, sex and mother’s age (mage)

•Save the syntax file for later use

5-10 min

https://www.med.uio.no/helsam/forskning/aktuelt/arrangementer/andre/stata-course-uio.html

(14)

Descriptive

• Continuous

summarize weight

summarize weight, detail // adds percentiles and more

• Categorical

tabulate bullied

tabulate bullied, nolab // show the numeric coding

(15)

Other descriptives

tabstat mAge, stat( N min p50 mean max) by(parity)

(16)

Generate, replace

•Index (young men)

generate index=0

replace index=1 if sex==1 & age<30

•Young/Old

generate old=(age>50) if age<.

•Serial numbers

generate id=_n

(17)

Recode

•Recode 1/2 into 0/1

recode sex (1=0) (2=1), gen(sex0)

•Alternative

generate sex0=sex-1
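The generate, replace and recode commands above can be tried on a small made-up dataset. This is only a sketch: the four observations are invented for illustration, and sex0b is a hypothetical name used so that both recoding routes can be compared.

```stata
* Toy data (invented values, for illustration only)
clear
input sex age
1 25
2 45
1 62
2 .
end

generate id = _n                          // serial number: 1, 2, 3, 4
generate index = 0
replace index = 1 if sex == 1 & age < 30  // flag young men
generate old = (age > 50) if age < .      // missing age gives missing old

recode sex (1=0) (2=1), generate(sex0)    // route 1: recode
generate sex0b = sex - 1                  // route 2: arithmetic

list
```

Without the "if age < ." qualifier, the observation with missing age would be coded old = 1, because missing counts as larger than any number.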

(18)

Dates

•From numeric to date (3 numeric variables into date variable)

ex: m=12, d=2, y=1987

generate birth=mdy(m,d,y)

format birth %td

•From string to date (1 string variable into date variable)

ex: bstr="02.12.1987"

generate birth=date(bstr,"DMY")

format birth %td
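Both conversions can be checked against each other on a single made-up observation. A sketch, assuming the example values from the slide; birth1 and birth2 are hypothetical names so the two results can sit side by side.

```stata
* One invented observation holding the same date in both forms
clear
input m d y str10 bstr
12 2 1987 "02.12.1987"
end

generate birth1 = mdy(m, d, y)        // from three numeric variables
generate birth2 = date(bstr, "DMY")   // from one string variable
format birth1 birth2 %td              // display as dates, not day counts

list
```

Stata stores dates as the number of days since 1 January 1960; the %td format only changes how the number is displayed.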

(19)

Exercise 2

• Summarize mother’s age

• Tabulate sex

• Recode sex into sex0 with categories 0, 1

• Generate new gestational age in weeks (the old is in days)

•Summarize the new variable

10 min

(20)

Missing

•Note!

Represented as "."

Missing values are treated as larger than any number

if age>30 will include missing.

if age>30 & age<. will not.

•Test

replace age=0 if (age==.)

•Change

replace educ=. if educ==99
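The effect of missing values on comparisons is easiest to see by counting. A small sketch on invented data (three observations, one with missing age and one with educ coded 99); the missing() function is standard Stata but does not appear on the slides.

```stata
* Toy data (invented values): one missing age, one educ coded 99
clear
input age educ
25 3
40 99
.  2
end

count if age > 30             // 2: age 40 and the missing age
count if age > 30 & age < .   // 1: the missing age is excluded
count if missing(age)         // 1: same test as age == .

replace educ = . if educ == 99   // turn the 99 code into missing
misstable summarize age educ
```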

(21)

Describe missing

•Summarize missing

misstable summarize weight sex gest

•Missing in tables

tab bullied sex, missing

(22)

Exercise 3

• Tabulate missing in gestational age (gest) with the misstable command

• Tabulate gest4 versus sex and include missing

• Summarize mage if gest is greater than 260 days

•Will this include missing in gest?

•Summarize mage if gest is greater than 260 days excluding missing in gest

10 min

(23)

Help

•General

help command

findit keyword // search Stata and the net

•Examples

help table

findit aflogit

Many videos on YouTube

(24)

Summing up

Use do files

Run: select the lines, Ctrl-D

Syntax

•command [varlist] [if exp] [in range] [, options]

• Missing

if age>30 & age<.

generate old=(age>50) if age<.

• Help

•help describe

Oct-19 24

(25)

GRAPHICS

(26)

Twoway plots

• Syntax

•twoway (plot1, opts) (plot2, opts), opts

• One plot

•kdensity bw

•scatter bw gest

[Figure: kernel density estimate of birth weight (kernel = epanechnikov, bandwidth = 102.3251)]

[Figure: scatter plot of birth weight against gestational age]

(27)

twoway (scatter bw gest) (fpfitci bw gest) (lfit bw gest)

[Figure: "Weight by gestational age"; y-axis in grams, x-axis in days; legend: scatter, smooth with CI, line fit]

(28)

Titles

[Figure: scatter plot showing where title, subtitle, ytitle, xtitle and note are placed]

scatter bw gest, title("title") subtitle("subtitle") ///

xtitle("xtitle") ytitle("ytitle") note("note")
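A layered plot with titles can be sketched as follows. It assumes Stata's bundled auto data rather than the birth data, so mpg and weight here are car mileage and car weight, and the title texts are made up.

```stata
* Assumption: the bundled auto dataset stands in for the course data
sysuse auto, clear

* Plots are drawn in the order listed, so the shaded confidence band
* goes first and the points and the straight line are drawn on top of it
twoway (fpfitci mpg weight) (scatter mpg weight) (lfit mpg weight), ///
    title("Mileage by weight") subtitle("auto data")               ///
    xtitle("Weight (lbs)") ytitle("Miles per gallon")              ///
    note("Smooth with CI, scatter and linear fit")
```

Swapping the order so that fpfitci comes after scatter would let the shaded band cover some of the points.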

(29)

Exercise 4

•Make a density plot of birth weight (weight)

•Make a scatter plot of birth weight versus gestational age (gest)

Replace the outlier in gestational age (gest) with missing

Restrict the plot to gestational age greater than 250 days (hint: if gest>250)

Add a linear fit line to the scatter plot to see the trend

Add a smoothing curve with confidence interval to the plot (fpfitci) to look for non-linear trend. The order of plots matters

Add a title, ytitle and xtitle to the plot

10 min

(30)

BIVARIATE ANALYSIS

(31)

Two independent samples

Do boys and girls have the same mean birth weight?

twoway ( kdensity weight if sex==1, lcolor(blue) ) ///
( kdensity weight if sex==2, lcolor(red) )

[Figure: kernel density of birth weight for each sex; x-axis: Birth weight]

Equal means?

Equal variance?

Test of equal variance:

robvar weight, by(sex)

(32)

Two independent samples test

ttest weight, by(sex) // two-sample t-test, equal variances

ttest weight, by(sex) unequal // two-sample t-test, unequal variances

ttest w1==w2 // paired t-test
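The variance check and the t-test variants can be tried on datasets that ship with Stata. This is a sketch under that assumption: auto and bpwide are example data, not the course data, and mpg, foreign, bp_before and bp_after are their variables.

```stata
* Two independent samples: domestic versus foreign cars
sysuse auto, clear
robvar mpg, by(foreign)           // test of equal variances
ttest mpg, by(foreign)            // assumes equal variances
ttest mpg, by(foreign) unequal    // allows unequal variances

* Paired samples: two measurements on the same patients
sysuse bpwide, clear
ttest bp_before == bp_after       // paired t-test
```

Whether to add the unequal option is usually decided from the robvar result together with the density plots.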

(33)

Crosstables

Equal proportions?

Are boys bullied as much as girls?

tabulate bullied sex, col chi2 nofreq
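A comparable cross-table can be produced from the bundled auto data. A sketch only: heavy is a hypothetical indicator created for the example, and the 3000 lbs cutoff is arbitrary.

```stata
* Assumption: the bundled auto dataset stands in for the course data
sysuse auto, clear

* Hypothetical binary indicator: cars heavier than 3000 lbs
generate heavy = (weight > 3000) if weight < .

* Column percentages only, plus a chi-square test of equal proportions
tabulate heavy foreign, column chi2 nofreq
```

With column percentages, each column (here domestic and foreign) sums to 100, so the rows can be compared across groups.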

(34)

Exercise 5

The variable “magegr2” contains mother’s age in two groups. Do tab magegr2 and tab magegr2, nolab to find the groups and the coding. An alternative to find coding is to list all labels: label list

Make a plot of the birth weight distribution for each of the two groups of mother’s age.

Do a ttest of weight by magegr2. Are the means different?

Redo the ttest for weight>2000 to get more normal distributions.

Are the means different?

Are the p-values different?

Generate an indicator for high birth weight (>4500).

Make a table of high birth weight by gestgr2 with column percentages and a chi-square test

(35)

Extra (if you have time)

•Do a help tabstat and look at the statistics options

•Do a tabstat of weight showing N min p25 p50 p75 max, by magegr2

(36)

Summing up

•Descriptive

summarize weight

tabulate sex

•Graphs

twoway (plot1, opts) (plot2, opts), opts

•Bivariate

•ttest weight, by(sex)

•tabulate bullied sex, chi2

(37)

EXTRA MATERIAL

(38)

Save output (Log results)

•Save a portion of the analysis as a .smcl file

log using "results.smcl"

log close
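A complete logging sequence might look like the sketch below. It assumes the bundled auto data and writes results.smcl and results.log to the current working directory; the replace options overwrite earlier files with the same names.

```stata
capture log close                    // close a log that may still be open
log using "results.smcl", replace    // start logging

sysuse auto, clear
summarize mpg                        // everything from here is recorded

log close                            // stop logging

* Optional: convert the SMCL log to plain text
translate "results.smcl" "results.log", replace
```

The capture prefix suppresses the error that log close would otherwise give when no log is open.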

(39)

Keep plots during session

•Set “tabbed” graphics

•Give each plot a name

set autotabgraphs on, permanently

twoway …, name(scatter, replace)

(40)

Copy output

•Copy graphs to Word or PowerPoint

Save graphs in many formats, or

Right-click on a graph to copy

•Copy tables to Excel

Select the table, Ctrl-Shift-C
