The best kittens, technology, and video games blog in the world.

Thursday, April 19, 2007

Dieting for geeks

[Image: Toefel has his own 'food chain'... by Sibi from flickr (CC-NC-ND)]

I have no idea if it's going to be useful for anyone or not. I played a bit with the USDA Nutrient Database, trying to score different foods on their "healthiness".

There is a very widespread belief that eating has an enormous impact on health. Different sources suggest eating more or less of certain foods, or getting more or less of certain nutrients. Such advice tends to have very little basis in hard evidence. What is basically known is that very low intake of vitamins and minerals is harmful, and that diet has a moderate impact on cardiovascular health. There's hardly any evidence associating high or very high intakes of particular vitamins, minerals, or other nutrients with major improvement in health, and little evidence that diet can have more than a minor effect on cancer or on diseases other than cardiovascular ones.

If you're interested in dieting and health, probably the first thing you should read is the paper "Diet quality and major chronic disease risk in men and women: moving toward improved dietary guidance". The paper explored some alternative indexes of healthy eating (the Alternate Healthy Eating Index, AHEI, and the Recommended Food Score, RFS), knowing from previous research that the USDA Food Pyramid failed miserably at predicting anything.

After adjusting for other risk factors in the multivariate adjusted analysis, men in the highest quintile of AHEI scores had a 39% lower risk of CVD than did men in the lowest quintile (RR = 0.61; 95% CI: 0.49, 0.75), whereas men in the highest quintile of RFS had a 23% reduction in CVD risk (RR = 0.77; 95% CI: 0.64, 0.93). Neither score predicted cancer risk, after multivariate adjustment.
The associations of both the AHEI score and RFS with risk of major chronic disease, CVD, and cancer in women are shown in Table 5. Overall, the findings were weaker than those for men. The AHEI score predicted a weak but significant reduction in major chronic disease risk in our multivariate models (RR = 0.89; 95% CI: 0.82, 0.96; P = 0.008). AHEI scores in the highest quintile compared with the lowest quintile were associated with a 28% lower risk of CVD in women (RR = 0.72; 95% CI: 0.60, 0.86; P < 0.001). As with men, we observed no significant associations between AHEI score and cancer risk. All models evaluating the RFS in women were nonsignificant after multivariate adjustment.
So AHEI correlates only moderately with risk of CVD, and with very little else. Since indexes that measure the whole diet get such weak results, there's really little hope that measuring individual nutrient intakes or individual food products will predict anything at all. Yet, just for fun, here's a tool for building such indexes. Just don't consider the results meaningful.
# The data can be downloaded from the following URLs:
# http://www.nal.usda.gov/fnic/foodcomp/Data/SR19/asc/FD_GROUP.txt
# http://www.nal.usda.gov/fnic/foodcomp/Data/SR19/asc/FOODS_DES.txt
# http://www.nal.usda.gov/fnic/foodcomp/Data/SR19/dnload/sr19abbr.zip
# * (Unpack ABBREV.txt file from the zip)

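# Yields the fields of each line of an SR19 file.
# Fields are separated by ^; text fields are wrapped in ~, everything else is numeric.
def parse_sr19(file_name)
  # File.foreach closes the file when done; the -1 limit keeps trailing empty fields
  File.foreach(file_name) {|line|
    fields = line.chomp.split('^', -1).map{|entry| if entry =~ /\A~(.*)~\Z/ then $1 else entry.to_f end }
    yield(*fields)
  }
end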

$food_group = {}
parse_sr19('FD_GROUP.txt') {|food_id, name|
  $food_group[food_id] = name
}

$food_des = {}
parse_sr19('FOOD_DES.txt') {|food_id, food_group_id, long_dsc, *rest|
  $food_des[food_id] = [$food_group[food_group_id], long_dsc]
}

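food = {}
parse_sr19('ABBREV.txt') {|*fields|
  ndb_no     = fields[0]

  group, dsc = $food_des[ndb_no]

  short_dsc  = fields[1]
  energy     = fields[3]
  # Foods with no energy cannot be scaled to a 2000 kcal diet
  next if energy == 0.0

  # Conversion factor:
  # * Assume 2000.0 kcal diet
  # * Scales each value to what 2000 kcal of this food alone would contain
  cf = 2000.0 / energy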

  # Score in RDAs for male, 19-30y
  # * http://www.iom.edu/Object.File/Master/21/372/0.pdf
  ca         = fields[10]*cf/1000.0 # mg -> RDA
  fe         = fields[11]*cf/8.0    # mg -> RDA
  mg         = fields[12]*cf/400.0  # mg -> RDA (computed, but not used in the score or the output)
  phosphorus = fields[13]*cf/700.0  # mg -> RDA
  k          = fields[14]*cf/4700.0 # mg -> RDA
  na         = fields[15]*cf/1500.0 # mg -> RDA
  zn         = fields[16]*cf/11.0   # mg -> RDA
  cu         = fields[17]*cf/900.0  # field is in mg but the RDA is 900 ug (0.9 mg), so copper is understated ~1000x
  mn         = fields[18]*cf/2.3    # mg -> RDA
  se         = fields[19]*cf/55.0   # ug -> RDA
  vit_c      = fields[20]*cf/90.0   # mg -> RDA
  thiamin    = fields[21]*cf/1.2    # mg -> RDA
  riboflavin = fields[22]*cf/1.3    # mg -> RDA
  niacin     = fields[23]*cf/16.0   # mg -> RDA
  panto_acid = fields[24]*cf/5.0    # mg -> RDA
  vit_b6     = fields[25]*cf/1.3    # mg -> RDA
  folate     = fields[26]*cf/400.0  # ug -> RDA
  vit_b12    = fields[30]*cf/2.4    # ug -> RDA
  vit_a_rae  = fields[32]*cf/900.0  # ug rae -> RDA
  vit_e      = fields[34]*cf/15.0   # mg -> RDA
  vit_k      = fields[35]*cf/120.0  # ug -> RDA

  # More than 5x RDA density doesn't give any extra score
  score_vits = [vit_c,thiamin,riboflavin,niacin,panto_acid,vit_b6,folate,vit_b12,vit_a_rae,vit_e,vit_k,
                ca,fe,phosphorus,k,na,zn,cu,mn,se].
                map{|e| if e > 5.0 then 5.0 else e end}.
                inject{|a,b| a+b}

  vitamins = [[vit_c,thiamin,riboflavin,niacin,panto_acid,vit_b6,folate,vit_b12,vit_a_rae,vit_e,vit_k].map{|e| (100*e).round*0.01}].map{|x|
    "#{x[8]}A #{x[1]}B1 #{x[2]}B2 #{x[3]}B3 #{x[4]}B5 #{x[5]}B6 #{x[6]}B9 #{x[7]}B12 #{x[0]}C #{x[9]}E #{x[10]}K"}[0]
  minerals = [[ca,fe,phosphorus,k,na,zn,cu,mn,se].map{|e| (100*e).round*0.01}].map{|x|
    "#{x[0]}Ca #{x[1]}Fe #{x[2]}P #{x[3]}K #{x[4]}Na #{x[5]}Zn #{x[6]}Cu #{x[7]}Mn #{x[8]}Se"}[0]

  protein    = fields[4]*cf  # g
  lipid      = fields[5]*cf  # g
  carb       = fields[7]*cf  # g
  fiber      = fields[8]*cf  # g
  fa_sat     = fields[41]*cf # g
  fa_mono    = fields[42]*cf # g
  fa_poly    = fields[43]*cf # g
  choles     = fields[44]*cf/1000.0 # g

  # And now the scoring
  score = 0.3 * protein + fa_poly - fa_sat + 3 * score_vits - 10.0 * choles

  # Round only for display, after the score has been computed
  fa_poly = fa_poly.round
  fa_mono = fa_mono.round
  fa_sat  = fa_sat.round
  choles  = (10.0*choles).round*0.1

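  # Store -score so that a plain ascending sort lists the best foods first.
  # The string's continuation lines stay unindented so the output is unchanged.
  food[group] ||= []
  food[group] << [-score, "* #{dsc}
* Score: #{score}
* Fats: #{fa_poly}/#{fa_mono}/#{fa_sat}/#{choles}
* VM-Score: #{score_vits}
* Vitamins: #{vitamins}
* Minerals: #{minerals}
"]
}

food.to_a.sort.each{|group, foods|
  puts "Group: #{group}"
  puts foods.sort.map{|sc,dsc| dsc}
}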
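
In case the file format is unclear, here is a minimal sketch of what the parsing step does to a single line. The sample line is made up in the SR19 layout (it is not copied from the USDA files), and parse_sr19_line is a hypothetical helper, not part of the script above.

```ruby
# Made-up sample line: ^ separates fields, ~...~ wraps text fields.
sample_line = "~99999~^~Example food~^12.5^^~~\r\n"

# Hypothetical helper applying the same field rule as parse_sr19:
# ~text~ becomes a String, anything else goes through to_f.
def parse_sr19_line(line)
  # The -1 limit keeps trailing empty fields, which a plain split would drop.
  line.chomp.split('^', -1).map do |entry|
    entry =~ /\A~(.*)~\z/ ? $1 : entry.to_f
  end
end

fields = parse_sr19_line(sample_line)
p fields
```

Note that an empty numeric field turns into 0.0, because "".to_f is 0.0 in Ruby. That is how missing values end up counted as zero in the scores.
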
The script parses the USDA data and converts it to a useful format, namely "how many times the RDA would you get if you ate only this food, 2000 kcal/day". Vitamins and minerals are given in RDAs; polyunsaturated, monounsaturated, and saturated fatty acids, protein, and cholesterol are in grams/day, all assuming only this particular type of food is eaten. The Fats line lists polyunsaturated/monounsaturated/saturated/cholesterol, in that order. Scores of this kind are pretty useful, as mixing different foods simply corresponds to computing a weighted average (weighted by each food's share of calories). To avoid giving overly high scores to single-nutrient foods, nutrient content above 5x RDA is ignored. The output with the default scoring function looks something like this:
Group: Ethnic Foods
* Buffalo, free range, top round steak, raw (Shoshone Bannock)
* Score: 259.922257628
* Fats: 2/9/8/0.0
* VM-Score: 45.4825033844174
* Vitamins: 0.0A 2.58B1 5.05B2 8.74B3 3.07B5 12.12B6 0.0B9 9.68B12 0.0C 0.0E 0.0K
* Minerals: 0.06Ca 6.59Fe 5.77P 1.53K 0.58Na 5.51Zn 0.0Cu 0.1Mn 2.57Se
* Caribou, shoulder meat, dried (Alaska Native)
* Score: 253.402785187772
* Fats: 3/9/12/1.2
* VM-Score: 47.6705470921111
* Vitamins: 0.09A 1.92B1 7.38B2 6.83B3 6.13B5 2.83B6 0.15B9 46.43B12 0.0C 0.03E 0.0K
* Minerals: 0.1Ca 10.15Fe 5.8P 1.27K 0.97Na 6.31Zn 0.01Cu 0.35Mn 4.94Se
Group: Fast Foods
* McDONALD'S,  Hamburger
* Score: 64.1448551547567
* Fats: 2/25/23/0.2
* VM-Score: 19.4689199722205
* Vitamins: 0.0A 1.63B1 1.46B2 2.25B3 0.0B5 0.0B6 1.27B9 2.74B12 0.05C 0.0E 0.0K
* Minerals: 0.96Ca 2.62Fe 1.21P 0.34K 2.68Na 1.35Zn 0.0Cu 0.89Mn 0.0Se
Group: Baked Products
* ARCHWAY Home Style Cookies, Coconut Macaroon
* Score: -74.9294687966982
* Fats: 4/7/87/0.0
* VM-Score: 1.66698866197016
* Vitamins: 0.0A 0.07B1 0.2B2 0.06B3 0.0B5 0.0B6 0.05B9 0.0B12 0.0C 0.0E 0.0K
* Minerals: 0.02Ca 0.45Fe 0.0P 0.11K 0.7Na 0.0Zn 0.0Cu 0.0Mn 0.0Se
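
The weighted-average remark can be sketched roughly like this. The densities are invented round numbers, not USDA values, and mix and vm_score are hypothetical helpers that are not part of the script. The only assumption is that each food is described by its "RDAs per 2000 kcal" densities, as in the output above.

```ruby
# Made-up nutrient densities in "RDAs per 2000 kcal" (not USDA values).
food_a = { vit_c: 4.0, fe: 0.5 }
food_b = { vit_c: 0.0, fe: 2.5 }

# Hypothetical helper: a diet is a list of [densities, share_of_calories] pairs.
# Every density is per 2000 kcal, so mixing is a calorie-weighted average.
def mix(parts)
  total = parts.sum { |_, share| share }
  result = Hash.new(0.0)
  parts.each do |densities, share|
    densities.each { |nutrient, rda| result[nutrient] += rda * share / total }
  end
  result
end

# Same cap as the script: anything above 5x RDA earns no extra score.
def vm_score(densities)
  densities.values.map { |e| [e, 5.0].min }.sum
end

# A diet getting 25% of its calories from food A and 75% from food B.
diet = mix([[food_a, 0.25], [food_b, 0.75]])
puts diet.inspect
puts vm_score(diet)
```

The per-nutrient densities average linearly, but the capped VM-Score does not: capping each food first and then averaging can give a different number than capping the mix, so here the cap is applied to the mixed diet.
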
Two more things: a lot of data is missing. Usually the amount of a nutrient for which there's no data is negligible, so the score isn't affected, but that's not always so. And there's the huge subject of bioavailability, which the data completely ignores, even though the difference can be quite significant.
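
To see how much data is missing for a given food, one option is to keep "no data" apart from a real zero. This is only a sketch: it assumes missing values show up as empty fields (which the script silently turns into 0.0), and parse_field is a hypothetical variant of the field rule, not something the script contains.

```ruby
# Hypothetical variant of the field rule: an empty field becomes nil
# instead of 0.0, so "no data" can be told apart from "none present".
def parse_field(entry)
  return $1 if entry =~ /\A~(.*)~\z/
  entry.empty? ? nil : entry.to_f
end

# Made-up nutrient fields for one food: the second and fourth are missing.
raw = ['1.5', '', '0', '']
values = raw.map { |e| parse_field(e) }
missing = values.count(&:nil?)
puts "#{missing} of #{values.size} nutrient values have no data"
```

With something like this, foods with many missing values could be flagged or skipped rather than quietly scored as if they contained none of those nutrients.
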
