import pandas as pd
# Read the CSV file
data = pd.read_csv('data.csv')
# Compute the incidence (risk) in the exposed and unexposed groups
exposed_disease_count = data[(data['Exposed'] == 'Yes') & (data['Diseased'] == 'Yes')].shape[0]
exposed_total_count = data[data['Exposed'] == 'Yes'].shape[0]
exposed_risk = exposed_disease_count / exposed_total_count
unexposed_disease_count = data[(data['Exposed'] == 'No') & (data['Diseased'] == 'Yes')].shape[0]
unexposed_total_count = data[data['Exposed'] == 'No'].shape[0]
unexposed_risk = unexposed_disease_count / unexposed_total_count
# Compute the relative risk (RR)
RR = exposed_risk / unexposed_risk
print(f"RR: {RR}")
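In this example, we first read the CSV file, then compute the incidence in the exposed and unexposed groups, and finally compute the relative risk (RR) as the ratio of the two. Note that the example assumes the data is already in the expected format, with `Exposed` and `Diseased` columns holding 'Yes'/'No' values, and that it contains no missing or erroneous entries. In practice, you may need to clean and preprocess the data first to ensure its quality.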
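As a rough illustration of the kind of cleaning mentioned above, the following sketch drops rows with missing values, normalizes label spelling, discards unrecognized labels, and guards against an undefined ratio before computing RR. The function name `relative_risk`, the specific cleaning rules, and the toy data are assumptions made for this example; real data may call for different handling.

```python
import pandas as pd


def relative_risk(df: pd.DataFrame) -> float:
    """Compute RR after basic cleaning of the 'Exposed' and 'Diseased' columns."""
    cols = ["Exposed", "Diseased"]
    # Drop rows where either field is missing.
    clean = df.dropna(subset=cols).copy()
    # Normalize labels such as 'yes ' or 'YES' to 'Yes' / 'No'.
    for col in cols:
        clean[col] = clean[col].astype(str).str.strip().str.capitalize()
    # Keep only rows whose labels are recognized in both columns.
    clean = clean[clean[cols].isin(["Yes", "No"]).all(axis=1)]

    exposed = clean[clean["Exposed"] == "Yes"]
    unexposed = clean[clean["Exposed"] == "No"]
    if exposed.empty or unexposed.empty:
        raise ValueError("Both exposed and unexposed groups must be non-empty.")

    # The mean of a boolean Series is the proportion of True values.
    exposed_risk = (exposed["Diseased"] == "Yes").mean()
    unexposed_risk = (unexposed["Diseased"] == "Yes").mean()
    if unexposed_risk == 0:
        raise ValueError("No cases in the unexposed group; RR is undefined.")
    return exposed_risk / unexposed_risk


# Hypothetical toy data, for illustration only.
sample = pd.DataFrame({
    "Exposed":  ["Yes", "yes ", "Yes", "Yes", "No", "No", "No", "No", None, "No"],
    "Diseased": ["Yes", "Yes", "No", "No", "Yes", "No", "No", "No", "Yes", "maybe"],
})
rr = relative_risk(sample)
print(f"RR: {rr:.2f}")
```

Raising an error when a group is empty or the unexposed risk is zero is one possible design choice; depending on the analysis, returning `NaN` or reporting the raw counts instead may be more appropriate.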