Bioinformaticsのお勉強 (Studying Bioinformatics): Is the mean systolic blood pressure of a population (one sample) higher than 130 mmHg?

Mean systolic blood pressure of a population whose standard deviation is unknown, versus the adult mean blood pressure

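We randomly sample 27 men in their twenties living in a certain region and measure their systolic blood pressure.

Using this sample, we want to test whether the mean systolic blood pressure of men in their twenties living in this region exceeds 130 mmHg, which is generally regarded as the upper limit of the normal range.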

# Start R

$ R

# Create an object named BP (Blood Pressure) that holds the 27 sampled values

BP <-c(134,123,145,134,130,124,156,128,129,

118,120,142,139,127,134,145,132,132,128,139,132,134,145,120,125,129,136)

# Visualize the data with a scatter plot, dot plot, histogram, and boxplot

png("120527_blood_pressure.png")

par(mfrow=c(2,2))

plot(BP, main="Blood Pressure", ylab="Blood Pressure(mmHg)", xlab="sample ID")

abline(a=130,b=0)

stripchart(BP, method="stack", pch=1,xlab="Blood pressure(mmHg)", main="Blood pressure")

hist(BP, main="Blood Pressure", xlab="Blood Pressure(mmHg)", ylab="Frequency")

boxplot(BP, main="Blood Pressure", ylab="Blood Pressure(mmHg)")

dev.off()

# Look up the help page for the t.test function

help(t.test)

/* Excerpt begins */

Description:

Performs one and two sample t-tests on vectors of data.

Usage:

t.test(x, y = NULL,

alternative = c("two.sided", "less", "greater"),

mu = 0, paired = FALSE, var.equal = FALSE,

conf.level = 0.95, ...)

Arguments:

x: a (non-empty) numeric vector of data values.

y: an optional (non-empty) numeric vector of data values.

alternative: a character string specifying the alternative hypothesis,

must be one of ‘"two.sided"’ (default), ‘"greater"’ or

‘"less"’. You can specify just the initial letter.

mu: a number indicating the true value of the mean (or difference

in means if you are performing a two sample test).

paired: a logical indicating whether you want a paired t-test.

var.equal: a logical variable indicating whether to treat the two

variances as being equal. If ‘TRUE’ then the pooled

variance is used to estimate the variance otherwise the Welch

(or Satterthwaite) approximation to the degrees of freedom is

used.

conf.level: confidence level of the interval.

/* Excerpt ends */

# Run a one-sample t-test

t.test(BP, mu=130, alternative="greater")

/* Result */

One Sample t-test

data: BP

t = 1.5133, df = 26, p-value = 0.07114

alternative hypothesis: true mean is greater than 130

95 percent confidence interval:

129.6704 Inf

sample estimates:

mean of x

132.5926

/* End of result */
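The printed result above can also be read programmatically, since t.test returns a list of class "htest". The following is a minimal sketch; the variable names (res, t_stat, p_val, and so on) are my own choices for illustration, not part of the original session.

```r
# Same 27 measurements as in the post
BP <- c(134,123,145,134,130,124,156,128,129,
        118,120,142,139,127,134,145,132,132,128,139,132,134,145,120,125,129,136)

# One-sided, one-sample t-test against mu = 130
res <- t.test(BP, mu = 130, alternative = "greater")

t_stat <- unname(res$statistic)  # t value
df     <- unname(res$parameter)  # degrees of freedom
p_val  <- res$p.value            # one-sided p-value
ci     <- res$conf.int           # one-sided 95% confidence interval

# Decision at the 5% significance level
reject_h0 <- p_val < 0.05

print(c(t = t_stat, df = df, p = p_val))
print(ci)
print(reject_h0)
```

Pulling the values out of the object this way avoids copying numbers by hand from the printed output.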

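Null hypothesis H0 : μ = 130 (the population mean is 130)

Alternative hypothesis H1 : μ > 130 (the population mean is greater than 130)

Significance level : α = 0.05

The test was run with these settings.

As a result,

p-value = 0.07114 > significance level = 0.05

so the null hypothesis H0 is not rejected, and we cannot say that the population mean is greater than 130. (That said, the result does not positively confirm that the population mean is 130 either.)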
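The same decision can be reached by comparing the t value with the critical value of the t distribution instead of comparing the p-value with α. A minimal sketch, assuming the same data; the names alpha, t_crit, and reject_h0 are illustrative.

```r
# Same 27 measurements as in the post
BP <- c(134,123,145,134,130,124,156,128,129,
        118,120,142,139,127,134,145,132,132,128,139,132,134,145,120,125,129,136)

alpha <- 0.05
n     <- length(BP)

# Observed t value for H0: mu = 130
t_value <- (mean(BP) - 130) / (sd(BP) / sqrt(n))

# Upper-tail critical value for a one-sided test with n - 1 degrees of freedom
t_crit <- qt(1 - alpha, df = n - 1)

# H0 is rejected only if the observed t exceeds the critical value
reject_h0 <- t_value > t_crit

print(c(t_value = t_value, t_crit = t_crit))
print(reject_h0)
```

Here the observed t value (about 1.513) falls below the critical value (about 1.706), which is the same conclusion as p-value > α.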

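The confidence interval is reported as

95 percent confidence interval: 129.6704 Inf

This means that the one-sided 95% confidence interval for the population mean has a lower bound of 129.6704 and an upper bound of infinity. The lower bound is below 130, which is consistent with H0 not being rejected.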
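The lower bound of this one-sided interval can be reproduced by hand as the sample mean minus the 95% quantile of the t distribution times the standard error. A minimal sketch; se, t_crit, and lower are names chosen here for illustration.

```r
# Same 27 measurements as in the post
BP <- c(134,123,145,134,130,124,156,128,129,
        118,120,142,139,127,134,145,132,132,128,139,132,134,145,120,125,129,136)

n  <- length(BP)
se <- sd(BP) / sqrt(n)          # standard error of the mean

# For a one-sided 95% interval, all 5% of the error goes in one tail
t_crit <- qt(0.95, df = n - 1)

lower <- mean(BP) - t_crit * se # lower bound; the upper bound is Inf

print(lower)
# Compare with the lower bound reported by t.test
print(t.test(BP, mu = 130, alternative = "greater")$conf.int[1])
```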

Running a two-sided test gives the following.

t.test(BP, mu=130, alternative="two.sided")

One Sample t-test

data: BP

t = 1.5133, df = 26, p-value = 0.1423

alternative hypothesis: true mean is not equal to 130

95 percent confidence interval:

129.0710 136.1142

sample estimates:

mean of x

132.5926

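The p-value (0.1423) is again above 0.05, so the conclusion is that we cannot say the population mean is either lower or higher than 130.

The 95% confidence interval runs from 129.0710 to 136.1142, which contains 130.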
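The two-sided interval can be checked in the same way, this time splitting the 5% evenly between both tails and using the 97.5% quantile. A minimal sketch with illustrative variable names:

```r
# Same 27 measurements as in the post
BP <- c(134,123,145,134,130,124,156,128,129,
        118,120,142,139,127,134,145,132,132,128,139,132,134,145,120,125,129,136)

n  <- length(BP)
se <- sd(BP) / sqrt(n)           # standard error of the mean

# Two-sided 95% interval: 2.5% in each tail
t_crit <- qt(0.975, df = n - 1)

ci_manual <- mean(BP) + c(-1, 1) * t_crit * se

print(ci_manual)
# Compare with the interval reported by t.test
print(t.test(BP, mu = 130, alternative = "two.sided")$conf.int)
```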

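As for where t = 1.5133 comes from, the t value is calculated with the following formula.

t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}

\bar{x} : sample mean

\mu_0 : the specific value to compare against (here 130)

s : sample standard deviation (computed from the unbiased variance)

n : sample size

Computing the t value from this definition gives the following R script.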

t.value <- (mean(BP)-130 ) / ( sd(BP) / sqrt(length(BP)) )

t.value

[1] 1.513264

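In the t distribution with length(BP) - 1 = 26 degrees of freedom,

the probability that the absolute value of t exceeds 1.513264 (that is, the two-sided p-value) is calculated as follows.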

(1-pt(1.513264, length(BP)-1))*2

[1] 0.1422746

This matches the two-sided p-value computed by the t.test function (0.1423).
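The one-sided p-value from the first test can be obtained the same way: it is the upper-tail probability alone, without doubling. A minimal sketch; pt with lower.tail = FALSE returns the upper-tail probability directly, and the variable names are illustrative.

```r
# Same 27 measurements as in the post
BP <- c(134,123,145,134,130,124,156,128,129,
        118,120,142,139,127,134,145,132,132,128,139,132,134,145,120,125,129,136)

n       <- length(BP)
t_value <- (mean(BP) - 130) / (sd(BP) / sqrt(n))

# One-sided p-value: P(T > t) under H0
p_one_sided <- pt(t_value, df = n - 1, lower.tail = FALSE)

# Two-sided p-value: both tails, so twice the one-sided value when t > 0
p_two_sided <- 2 * p_one_sided

print(p_one_sided)  # should agree with alternative = "greater" (about 0.07114)
print(p_two_sided)  # should agree with alternative = "two.sided" (about 0.1423)
```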

The equations posted on this blog were made by using Web Equation to turn handwritten input into TeX code,

then converting that code into a gif file with CODECOGS.

Web Equation (http://webdemo.visionobjects.com/equation.html)

CODECOGS (http://www.codecogs.com/latex/eqneditor.php)
