Best biostatistics books according to redditors

We found 95 Reddit comments discussing the best biostatistics books. We ranked the 53 resulting products by number of redditors who mentioned them. Here are the top 20.

Top Reddit comments about Biostatistics:

u/am_i_wrong_dude · 16 points · r/medicine

I've posted a similar answer before, but can't find the comment anymore.

If you are interested in doing your own statistics and modeling (like regression modeling), learn R. It pays amazing dividends for anyone who does any sort of data analysis, even basic biostats. Excel is for accountants and is terrible for biological data. It screws up your datasets when you open them, has no version control/tracking, has only rudimentary visualization capabilities, and cannot do the kind of stats you need to use the most (like right-censored data for Cox proportional hazards models or Kaplan-Meier curves). I've used SAS, Stata, SPSS, Excel, and a whole bunch of other junk in various classes and various projects over the years, and now use only R, Python, and Unix/Shell with nearly all the statistical work being in R. I'm definitely a biased recommender, because what started off as just a way to make a quick survival curve that I couldn't do in Excel as a medical student led me down a rabbit hole and now my whole career is based on data analysis. That said, my entire fellowship cohort now at least dabbles in R for making figures and doing basic statistics, so it's not just me.
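As a rough illustration of what "right-censored data" means for a Kaplan-Meier curve, here is a minimal sketch of the estimator. It is written in plain Python only to stay dependency-free; in R the same curve would normally come from `survfit` in the survival package. The function name `kaplan_meier` and the follow-up times are made up for illustration.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time for each subject
    events -- 1 if the event (e.g. death) was observed, 0 if right-censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every subject leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Subjects censored at time t still count as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical follow-up times in months; event 0 marks a censored subject.
times = [2, 3, 3, 5, 8, 8, 12]
events = [1, 1, 0, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Censored subjects never cause a drop in the curve, but they do shrink the risk set for later time points, which is exactly the bookkeeping a spreadsheet makes awkward.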

R is free, has an amazing online community, and is in heavy use by biostatisticians. The biggest downsides are

  • R is actually a strange and unpopular general programming language (Python is far superior for writing actual programs)
  • It has a steep initial learning curve (though once you get the basics it is very easy to learn advanced techniques).

Unfortunately, learning R won't teach you actual statistics. For that I've had the best luck with brick-and-mortar classes throughout med school and later fellowship, but many, many MOOCs, textbooks, and online workshops exist to teach you the basics.

If I were doing it all over again from the start, I would take a course or use a textbook that integrated R from the very beginning such as this.

Some other great statistical textbooks:

  • Introduction to Statistical Learning -- free legal PDF here -- I can't recommend this book enough
  • Elements of Statistical Learning -- A masterpiece of machine learning and modeling. I can't pretend to understand this whole book, but it is a frequent reference and aspirational read.

Online classes:
So many to choose from, but I am partial to DataCamp.

Want to get started?

  • Download R directly from its host, CRAN
  • Download RStudio (an integrated development environment for R that makes life infinitely easier) from its website (also free)
  • Fire up RStudio and type the following commands after the > prompt in the console:

    install.packages("swirl")

    library("swirl")

    swirl()

And you'll be off and running in a built-in tutorial that starts with the basics (how do I add two numbers) and ends (last I checked) with linear regression models.

ALL OF THAT SAID ------

You don't need to do any of that to be a good doctor, or even a good researcher. All academic institutions have dedicated statisticians (I still work with them all the time -- I know enough to know I don't really know what I am doing). If you can do your own data analysis though, you can work much faster and do many more interesting things than if you have to pay by the hour for someone to make basic figures for you.

u/daledinkler · 9 points · r/gis

I use R almost exclusively for my spatial analysis. Sometimes I use command line gdal stuff. Here is the book I used to get started: http://www.amazon.com/Applied-Spatial-Data-Analysis-Use/dp/1461476178

There are PDFs online.

u/kerblooee · 5 points · r/neuro

A good one is A Handbook of Functional MRI Data Analysis by Russell Poldrack.

u/xcthulhu · 5 points · r/math

Given your background, you could read Ken Binmore's Game Theory: A Very Short Introduction (2007). It's really short, but it assumes the reader is familiar with probability theory and a fair amount of mathematics. Binmore has another textbook, Playing for Real (2007), which goes into much more depth. It assumes the reader is familiar with linear algebra.

One of the central results of von Neumann and Morgenstern's Theory of Games and Economic Behavior (1944) is the minimax theorem, which von Neumann first proved in 1928. This was John von Neumann's favorite theorem from that book. John Nash generalized this in his PhD thesis in 1950. The minimax theorem establishes the existence of a Nash equilibrium for two-player zero-sum games with finitely many strategies. Nash extended this and showed that any normal form game with finitely many players and strategies has an equilibrium. You might have seen the movie A Beautiful Mind which depicted John Nash working on this. If you are interested, you can read about Nash's proof in Luce and Raiffa's Games and Decisions: Introduction and Critical Survey (1957). The proof assumes the reader is familiar with point set topology.

Outside of economics, game theory is also applied to evolutionary biology. One of the best books on evolutionary game theory is Martin Nowak's Evolutionary Dynamics: Exploring the Equations of Life (2006). You might also like John Maynard Smith's Evolution and the Theory of Games (1982). Maynard Smith assumes the reader is familiar with homogeneous differential equations.

Hope this helps!

u/jjrs · 4 points · r/statistics

Here's my favorite general, theoretical intro to Bayesian stats, by the author of the logic of science book above. Interesting to read and not too long-
http://bayes.wustl.edu/etj/articles/general.background.pdf

More... This one tries to re-teach stats from square one. It's alright, but stops short of Markov Chain Monte Carlo, which is where things get fun.
http://www.amazon.co.uk/Introduction-Bayesian-Statistics-William-Bolstad/dp/0470141158/ref=sr_1_1?ie=UTF8&s=books&qid=1280142090&sr=1-1

This is the one I'm reading now, which explains Bayes for people in the social sciences, and makes an effort to break down the cool stuff into simple terms. I really like the writer and it's good so far-
http://www.amazon.co.uk/Bayesian-Methods-Behavioral-Sciences-Statistics/dp/1584885629/ref=sr_1_2?ie=UTF8&s=books&qid=1280142179&sr=1-2

u/tathougies · 3 points · r/Catholicism

> This is a false idea, unless you know the standard deviation of the dataset.

Bayesian statistics bro. Here's a good book

Also, to extrapolate solely from the standard deviation, I would have to believe the distribution is normal. I have no reason to believe such a thing, and neither do you. A distribution would be an interesting measurement to cite, but seeing as you could cite neither the percentage you claimed in Boston (15%) nor the nationwide standard deviation, I doubt you have any information on the distribution.

u/icybrain · 3 points · r/Rlanguage

It sounds like you're looking for time series material, but Applied Predictive Modeling may be of interest to you. For time series and R specifically, this text seems well-reviewed.

u/[deleted] · 3 points · r/lgbt

I'm making calculations based on your behaviour, that of an authoritarian mod who claims to be an anarchist. You don't see the inherent contradiction there? And there's no way I'm telling you what university I went to, I don't see that it's relevant. If you really want to be a detective, the author of this was one of my tutors. What university did you go to?

I recommend this if you're interested in the field. These are long books, the ideas in them are fairly easy to get across in person, but it's just not feasible in a few reddit comment boxes. If you're saying you're willing to be swayed and haven't just made up your mind, I will answer specific questions. But "Explain it" is just too broad for me, sorry.

u/atleastihavemytowel · 3 points · r/neuro

This book is fantastic. There is a PDF floating around somewhere for free you can probably find with a quick Google search.

https://www.amazon.com/Handbook-Functional-MRI-Data-Analysis/dp/0521517664

u/porourke27 · 3 points · r/statistics

Honestly I think she selected qualitative by default, out of fear. She is courageous and smart, but this class has her a bit shook.

> "it's the university's job to teach her what she needs to succeed!"

Agreed! I appreciate this point and will help her see it that way. I think there should certainly be an early conversation with the instructor on the concepts and expectations in this class. That would certainly focus the conversation. Setting sights on only passing the class isn't really ideal, but it is an important step. Obviously learning the material is the goal.

Thank you for the text recommendation. I found PDQ stat on Amazon.
To anyone reading along, here is the LINK

> Pretty Darn Quick Amazon Review:

By An epidemiologist
Format: Paperback

This book is the only statistics book of its type. For each section covering a specific statistical method (from simple methods to those you may not even cover in your PhD training), a concise 2-5 page summary is presented. The goal is not to enable the reader to calculate any of these statistics, but to understand conceptually what each statistic means. This is where it can fill in information other statistics texts never get to. A student (or researcher!) who can churn out factorial ANOVA results, but doesn't truly understand what they mean can turn to this book for clarity. It's simple (for statistics), it's short, it's clear, and you have to love a book that is dedicated "To the many people who have made this book both possible and necessary -- authors of other statistics books"!

u/ffffruit · 3 points · r/epidemiology

Epidemiology is more science oriented whereas PH is more policy oriented. Both are very interesting and there's a substantial overlap. Epidemiology itself has many subtypes, such as environmental, clinical, social, etc. Again, you can pick the one that interests you the most.

A good introduction would either be to buy a book or do the LSHTM online course

u/randomjohn · 3 points · r/statistics

Not general statistics, but all the sample size calculations you could never want to see can be found in this book.

u/NotDeadJustSlob · 2 points · r/biology

Well if it is stats you are looking for then the standard in my department is Gotelli's A Primer of Ecological Statistics. For more general biological stats look at Whitlock & Schluter and Quinn & Keough. Also don't forget the classic Biostatistical Analysis.

u/eaturbrainz · 2 points · r/HPMOR

> "Gödel, Escher, Bach" by Douglas R. Hofstadter is the most awesome book that I have ever read. If there is one book that emphasizes the tragedy of Death, it is this book, because it's terrible that so many people have died without reading it.

Apparently I never got remotely far-enough into the book for this statement to make sense.

(I got tired of carrying that huge paperback around in my backpack.)

Lemme go get a Kindle copy.

I've heard good things about Good and Real.

> Artificial Intelligence: A Modern Approach, also recommended by Yudkowsky, is the most comprehensive, state of the art introduction to the theory and practice of artificial intelligence for modern applications. It's the leading textbook in the field of artificial intelligence, used in over 1100 universities worldwide. I think it's obvious why a community read-through of this would be beneficial.

Russell and Norvig is the standard textbook for "Good Old-Fashioned AI", i.e. the kind that's not at all worthy of being called "AI". It's used as a textbook in the first course in GOFAI for undergrads. It teaches fairly little programming, very little mathematics, and covers nothing of the kind of modern machine-learning techniques that actually get results these days, let alone the increasingly elegant and advanced learning techniques that are yielding good models of what cognition is.

On the textbook front, though, I can recommend that anyone with basic Calc 1+2 under their belt can go ahead and read Introduction to Bayesian Statistics to get a first taste of how "Bayesianism" actually works, and also why it hasn't taken over the world already (hint: computational concerns).

u/marshmallowpillow55 · 2 points · r/RStudio

Someone over on r/rlanguage posted this link to a list of R help resources. As we don't know quite what level you're at, you may want to look through there to see what's applicable to you.
If you are a total novice, one site I've had recommended (and is also linked on the above blog) is DataCamp. Personally I found this useful as a start to learning some of R's commands, but the first chunk of the course left me unable to actually make or run a program as it didn't fill in the basics (e.g. what a working directory is, how to actually download R). So I used that website in conjunction with the book Getting Started with R. Whilst it is targeted at biologists, the first half is certainly applicable to anyone getting to grips with R.
You'll have to decide yourself whether it's worth spending money on books if you'll only be using them for this one class or whether it would be better trying out some of the free online resources (or seeing if you find free ebook versions!).
As u/fang_xianfu said, a specific question will probably give you more targeted help and advice, so ask away!

u/adventuringraw · 2 points · r/learnmachinelearning

If you're doing this to help prepare to switch careers, look at industries and companies you might be interested in. Every vertical has different tech stack choices that are common. Medicine has a lot of SAS, the pharmaceutical researchers I've met all use R, and mainstream industry and research at this point is mostly Python. Python gets you the most bang for your buck. If you need to step outside ML and throw together a back end DB, a REST API and a front end to glue the whole thing together or whatever, Python's just as useful there as it will be with ML. I don't use R, but from what I hear it's much less versatile. The stats libraries for R are a lot more mature though apparently, so if you want to get into doing some more intense statistical stuff, I've heard Python is a little less friendly. I haven't run into any of those limitations, but I've been more playing around with RL and stuff, and doing less intense statistical analysis with rigorous confidence bounds or whatever.

For forecasting from historical data, you're looking at time series. Unfortunately I don't know a ton about time series modeling yet. It's much more complicated than a situation where you're assuming N iid draws from a stationary distribution (the 'typical' entry point for classification and such that you see in supervised machine learning).

Keeping in mind that I have no business giving you advice where to start because I haven't made the trek yet myself, I've heard good things about Time Series Analysis and Its Applications. It's a grad level stats book though, so I hope you aren't joking about your math background, haha. The examples in that book are all in R too, as a heads-up.

For a slightly easier (but still standard) introduction to the topic, I've also heard Wei's Time Series Analysis is decent. If you look around for a good introduction to multivariable time series analysis though, I'm sure you could find a lot of resources and judge for yourself what would most fit your needs. If you did pick one of those two books to pound out, I suspect you'll have a radically better idea how to go the rest of the way and get into practical application. As you're getting into the theory (whatever resource you use), I'd highly recommend picking a few datasets you're interested in (Kaggle might be a good source, to go with whatever you care to get into for your own reasons) and as you go, try applying the various methods you're learning on those few different datasets to get some sense of how it works and why. Pro-tip: one or two of your go-to toy datasets should be generated yourself with some simple to understand function to help give a really easily understandable case to play with, where your intuition can still hold up. y(t) = sin(t) + kt + N(0, b) maybe, or some simple dynamic process of the form y_{t+1} = f(y_t).
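The two toy processes suggested above could be generated along these lines. This is only a sketch using the standard library: the function names, the default values of k and b, the step size, and the choice of a logistic map for f are all illustrative assumptions, and b is read here as the standard deviation of the noise (the comment does not say whether it means standard deviation or variance).

```python
import math
import random


def trend_plus_seasonal(n, k=0.05, b=0.3, dt=0.1, seed=0):
    """Sample y(t) = sin(t) + k*t + noise at t = 0, dt, 2*dt, ...

    The noise is Gaussian with mean 0 and standard deviation b (an assumption).
    """
    rng = random.Random(seed)  # seeded so the toy dataset is reproducible
    return [math.sin(i * dt) + k * (i * dt) + rng.gauss(0.0, b) for i in range(n)]


def iterate_map(f, y0, n):
    """Return n values of the deterministic process y_{t+1} = f(y_t)."""
    ys = [y0]
    for _ in range(n - 1):
        ys.append(f(ys[-1]))
    return ys


series = trend_plus_seasonal(200)
# The logistic map is one arbitrary choice of f with rich but bounded dynamics.
logistic = iterate_map(lambda y: 3.7 * y * (1 - y), 0.2, 200)
print(series[:3])
print(logistic[:3])
```

Because the generating function is known, any forecasting method tried on these series can be checked against the true trend and seasonality rather than against a guess.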

But either way, make sure you're rolling up your sleeves and cracking your assumptions against actual data in code to make sure you get the idea. All theory and no practical makes Jack a dull boy.

Edit: if you want a more broad introduction without necessarily having the rigorous focus on time series forecasting, 'applied predictive modeling' and 'introduction to statistical learning' are both good big picture intros. The new hands on machine learning book is good too, but more narrow and less comprehensive. Elements of Statistical Learning is kind of the de facto standard reference text going over all the common algorithms from a mathematical perspective. If you have the mathematical maturity to tackle ESL, that'd be a great way to start to get a deep foundation in the theoretical ideas across ML as a whole, though obviously none of that is going to be time series specific.

u/jacobolus · 2 points · r/math

Sounds like a solid book for someone with an undergraduate math background,
https://amzn.com/038794415X
http://www.springer.com/us/book/9780387944159

u/ennervated_scientist · 2 points · r/labrats

The Analysis of Biological Data is fantastic for foundation stuff. Really recommend it.

https://www.amazon.com/Analysis-Biological-Data-Michael-Whitlock/dp/0981519407

u/haineus · 2 points · r/statistics

You should read a text on time series analysis. I recommend this one: http://www.amazon.com/Time-Series-Analysis-Its-Applications/dp/144197864X

u/saruwatari_takumi · 2 points · r/statistics

It's been a few years since I read it, but I really enjoyed Harvey Motulsky's Intuitive Biostatistics. He has a very good writing style and explains concepts clearly and broadly, rather than going into unnecessary details.

u/minorsecond · 1 point · r/statistics

I really liked this book, for one. It covers point pattern analysis, which I suspect will interest you.

u/batkarma · 1 point · r/Economics

I've never really found a probability book that I love. Here is the one I had for undergrad:

Pleasures of Probability

It's verbose, but provides excellent coverage of the major stuff. Here are three free ones, but it looks like they jump in a little too quickly:

http://www.math.uiuc.edu/~r-ash/BPT.html

https://web.math.princeton.edu/~nelson/books/rept.pdf

Probability Theory, the logic of science

And an MIT course:

http://ocw.mit.edu/courses/mathematics/18-440-probability-and-random-variables-spring-2011/index.htm

You basically want a strong understanding of:

Conditional probability, Bayes' theorem and its use, and expectation. And to be familiar with the Law of Large Numbers, the Central Limit Theorem and moment generating functions, and the use of all three.

Freund's Mathematical Statistics is the go-to book for mathematical statistics. It requires a strong grasp of integration techniques (including changing coordinate systems through substitution) and probability.

The graduate level econometrics texts most commonly used are Greene's Econometric Analysis and Hayashi's Econometrics (I have a slight preference for Hayashi)

u/felis-parenthesis · 1 point · r/slatestarcodex

First Idea: Language. Someone might invent a Constructed Language or conlang, that helps thinking and communicating. Life is more complicated than language. We should reject both mistake theory and conflict theory. Politics is nasty due to linguistic poverty. Our few words muddle together different things leading to quarrels.

Examples avoiding culture war: 1) The opening paragraph here. 2) The word error as used by journalists reporting on medical tests. We are really interested in positive predictive value and negative predictive value. We could work them out for ourselves if we knew the false positive rate and the false negative rate and the prevalence. Here our language is rich enough that there are words for the concepts that I claim we don't have words for. That is good, because it allows me to write down an example using words, and I can fall back on pointing out that natural language only has error. The other words and phrases belong to unnatural language :-)
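For readers who want to see how the predictive values follow from the false positive rate, the false negative rate and the prevalence, here is a small sketch using Bayes' theorem. The function name and the example numbers are hypothetical and chosen only for illustration.

```python
def predictive_values(prevalence, false_positive_rate, false_negative_rate):
    """Return (PPV, NPV) for a test applied to a population.

    PPV = P(disease | positive test), NPV = P(no disease | negative test).
    """
    sensitivity = 1 - false_negative_rate
    specificity = 1 - false_positive_rate
    # Fractions of the whole population falling in each cell of the 2x2 table.
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * false_positive_rate
    true_neg = (1 - prevalence) * specificity
    false_neg = prevalence * false_negative_rate
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical test: 1% prevalence, 5% false positives, 10% false negatives.
ppv, npv = predictive_values(0.01, 0.05, 0.10)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

With these made-up numbers the test is wrong only 5 to 10 percent of the time in either direction, yet a positive result means disease in only about 15% of cases, which is the distinction the single word "error" hides.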

Second Idea: Quantitative Dynamic Sociology. Think about Quantitative Ecological Theory. All those foxes eating hares until hares are rare and the foxes starve and die, and the hare population revives and the few remaining foxes put on weight and eventually start breeding again. Like that, but for ideas, like marriage, divorce, income tax, minimum wages,... Why do they wax and wane?
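The fox-and-hare cycle described here is the kind of behaviour the classic Lotka-Volterra predator-prey equations produce. The comment does not name that model, so treating it as the intended one is an assumption, and every parameter value in this sketch is illustrative rather than fitted to any data. It uses a plain forward-Euler step with a small step size, which is crude but enough to show the boom and bust.

```python
def simulate_foxes_and_hares(hares=10.0, foxes=10.0, t_end=60.0, dt=0.001,
                             hare_growth=1.1, predation=0.4,
                             fox_gain=0.1, fox_death=0.4):
    """Forward-Euler integration of the Lotka-Volterra equations.

    d(hares)/dt = hare_growth * hares - predation * hares * foxes
    d(foxes)/dt = fox_gain * hares * foxes - fox_death * foxes
    All parameter values are illustrative assumptions.
    """
    history = [(0.0, hares, foxes)]
    steps = int(round(t_end / dt))
    for step in range(1, steps + 1):
        d_hares = hare_growth * hares - predation * hares * foxes
        d_foxes = fox_gain * hares * foxes - fox_death * foxes
        hares += dt * d_hares
        foxes += dt * d_foxes
        history.append((step * dt, hares, foxes))
    return history


history = simulate_foxes_and_hares()
# Coarse summary: one row every 5 time units.
for t, h, f in history[::5000]:
    print(f"t={t:5.1f}  hares={h:8.3f}  foxes={f:8.3f}")
```

A toy model for ideas in the sense the comment proposes would swap the two populations for, say, adopters and opponents of a policy, but choosing the interaction terms is exactly the open problem being described.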

Peter Turchin is on to this and calls it Cliodynamics. Great, but I fear premature. I'm expecting a book like Evolution and the Theory of Games which gets criticized for unrealistic models. First we need someone to come up with a compendium of toy models for Quantitative Dynamic Sociology to show what it would even look like. Then, in 2119, the great mind can revolutionize the new field with models that actually work.

u/deck13 · 1 point · r/baseball

Is this what you are requesting?

Here is the copied version:

That quote from Fangraphs is wrong and I'm sorry that you are so dogmatic in your support for them. They are wrong for the same reasons that I pointed out: they can't normalize across a changing talent distribution (I am far from the only person to voice criticism in this direction, as an example see Bill James's take in: http://www.espn.com/blog/sweetspot/post/_/id/27050/what-we-talk-about-when-we-talk-about-war. Also see https://www.researchgate.net/publication/247739058_Concentration_of_Playing_Talent_Evolution_in_Major_League_Baseball/download which implicitly supports this claim). As a hypothetical example, suppose that someone calculated WAR for everyone in the history of the Japanese league and then adjusted it to the 162 game schedule. Suppose further that the highest WAR was obtained from someone from the Japanese league. Would you then think that that is the best season ever? I am not meaning this as a realistic example, just a thought experiment in what WAR measures and how it can be misapplied.

Your population dynamic calculations ignore segregation, the number of people who wanted to play in the MLB/minors but couldn't, and they ignore the global talent pool that the MLB draws from. A proper accounting of these things will switch your conclusion. You figure talent pool numbers by analyzing Latin American and other nations' census data. I did this more than two years ago and calculated that the chance of observing 11 or more people who played baseball before 1950 populating a top 25 all time list is less than 1/500 (this particular analysis corresponds to ESPN's list, WAR is a worse offender).

You dismiss my nonstationary stochastic process claim quickly and swiftly; I would like to see more substantive conceptual and analytical reasons for why you disagree. Note that peer-reviewed published literature agrees with me on the subject of nonstationarity in this context, see: https://link.springer.com/article/10.1140/epjb/e2010-10647-1 (I wouldn't draw too much else from this paper, because AFAIK no statistician worth his or her salt would detrend a time series in the manner that they present (https://stats.stackexchange.com/questions/120270/how-do-i-detrend-time-series, https://www.amazon.com/dp/144197864X/?tag=stackoverflow17-20, https://www.amazon.com/dp/0470540648/?tag=stackoverflow17-20, https://machinelearningmastery.com/time-series-trends-in-python/)) and see https://www.researchgate.net/publication/247739058_Concentration_of_Playing_Talent_Evolution_in_Major_League_Baseball/download again.

I read your post on WAR and disagreed with your assertion that you could compare players across eras using WAR when it was written. In fact, you found my "method" to be an interesting correction. Your exact quote (as written around a month ago) was: "That is an interesting method to correcting for that, and I would be interested to see that applied as weights to fWAR over the last 100 or years. WAR still is probably the best way to compare a players baseball talent to a replacement level player of that time. Now if you brought Ruth or Williams into today's game, they would probably not post the same stats, true. But adjusting for that is practically impossible, although I am interested in your (or the one you found) method as described above." This is an example of something that could be done and yes, it is my method.

Source: Have a PhD in Statistics and teach it at the college level.

Edit: for your records, here is my disagreement to your original post on WAR https://old.reddit.com/r/baseball/comments/96lkfm/war_explained/e41or6o/ (read how I am approaching the problem to account for the underlying innate talent distribution, not the measured talent pool which is artificially constrained by the number of minor league slots) and here is your response to my disagreement https://old.reddit.com/r/baseball/comments/96lkfm/war_explained/e428mfz/

Edit 2: edited content to make it more readable and added some sourcing

u/editorijsmi · 1 point · r/statistics

You can check the following books:

  1. Bayesian Methodology: An overview with the help of R software: Tool for data science professionals

    https://www.amazon.com/dp/B07QCHTR54

  2. Essentials of Bio-Statistics: An overview with the help of Software

    https://www.amazon.com/dp/B07GRBXX7D

u/Gunwild · 1 point · r/technology

Last semester my pharmacy school program made us self learn/review stats with this book.

Worst stats book ever. I also hate seeing math scribbled on PowerPoints.

I think I've done enough complaining for one day...

u/yarasa · 1 point · r/statistics

I have used the following two books:

  1. Good introduction, with a discussion of frequentist vs Bayesian statistics:

    www.amazon.com/gp/aw/d/0470141158?pc_redir=1411138170&robot_redir=1

  2. PDF available online, more machine learning oriented:

    http://web4.cs.ucl.ac.uk/staff/d.barber/pmwiki/pmwiki.php?n=Brml.HomePage?from=Main.Textbook

u/geodude247 · 1 point · r/gis

Have you tried the packages spdep, spatstat, gstat? In the class I took on this subject, we used these packages along with maptools and GISTools to avoid Arc entirely. This book was our reference:
https://www.amazon.com/Introduction-Spatial-Analysis-Mapping/dp/1446272958

If I'm not mistaken, the package spdep was developed by the authors of these books:
https://www.amazon.com/Applied-Spatial-Data-Analysis-Use/dp/1461476178
https://www.amazon.com/Spatial-Statistics-Geostatistics-Applications-Information/dp/1446201740

Were you instructed to use geoRglm?

u/HowAboutNitricOxide · 1 point · r/medicalschool

Intuitive Biostatistics by Harvey Motulsky

u/pchiodo · 1 point · r/techsupport

This is a bogus download link. It is just a free offer mill and will not legally let you download this book. This is a $65 book sold on Amazon, and appears to be a college text.

http://www.amazon.com/Intuitive-Biostatistics-Nonmathematical-Statistical-Thinking/dp/0199946647

Think you'll just need to buy it or find someone to borrow it from.

u/The_Golden_Image · 1 point · r/techsupport

We're not here to help you break the law.

Buy the book

u/loveless26 · 1 point · r/slavelabour

looking for a pdf version of this textbook https://www.amazon.com/Practice-Statistics-Life-Sciences-ebook/dp/B0787CQW6S

will pay $5-10

u/GhostGlacier · 1 point · r/statistics

If you're just starting out I might suggest the following websites for an intuitive understanding of statistics. I think they're better than most books for visualizing and explaining the fundamentals.

https://www.youtube.com/channel/UCFrjdcImgcQVyFbK04MBEhA

https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw

https://statquest.org/video-index/

http://www.bcfoltz.com/blog/stats-101/

As far as books go for intuitively understanding the basics: there's PDQ Statistics, I also like the stats for dummies books.

u/lemurlemur · 1 point · r/datascience

Biostatistics, The Bare Essentials is excellent. Very practical statistics explained in fairly simple terms. Obviously this is skewed toward statistical problems from biology, but it's fairly easy to extrapolate to other data science problems.

u/The_Last_Raven · 0 points · r/biology

Schrödinger's book has been recommended to me, as has On Growth and Form by D'Arcy Wentworth Thompson (http://www.amazon.com/Growth-Form-Complete-Revised-Biology/dp/0486671356).

The original was in like the early 1900s but updated versions should be fine. On Growth and Form is more for those wondering about mathematics in biology though.

I'm not too clear on what angle you want, but often you'll find that Bio texts are woefully out of date in many areas if you are looking at something in particular.

The Cell is also a good book (and free as an electronic resource at many universities).