(Part 2) Top products from r/statistics
We found 63 product mentions on r/statistics. We ranked the 465 resulting products by number of redditors who mentioned them. Here are the products ranked 21-40.
21. Probability and Statistical Inference (9th Edition)
Sentiment score: 2
Number of reviews: 3
Written by three veteran statisticians, this applied introduction to probability and statistics emphasizes the existence of variation in almost every process, and how the study of probability and statistics helps us understand this variation.
22. All of Statistics: A Concise Course in Statistical Inference (Springer Texts in Statistics)
Sentiment score: 2
Number of reviews: 3
23. The Theory That Would Not Die: How Bayes' Rule Cracked the Enigma Code, Hunted Down Russian Submarines, and Emerged Triumphant from Two Centuries of Controversy
Sentiment score: 1
Number of reviews: 3
Yale University Press
24. Statistical Models: Theory And Practice
Sentiment score: 2
Number of reviews: 3
25. Probability and Statistics for Engineering and the Sciences
Sentiment score: 1
Number of reviews: 3
26. Doing Bayesian Data Analysis: A Tutorial with R and BUGS
Sentiment score: 2
Number of reviews: 3
28. Introductory Statistics with R (Statistics and Computing)
Sentiment score: 1
Number of reviews: 3
Springer
29. Schaum's Outline of Probability and Statistics, 4th Edition: 897 Solved Problems + 20 Videos (Schaum's Outlines)
Sentiment score: 1
Number of reviews: 3
McGraw-Hill
30. All of Statistics: A Concise Course in Statistical Inference (Springer Texts in Statistics)
Sentiment score: 2
Number of reviews: 3
Springer
31. A First Course in Bayesian Statistical Methods (Springer Texts in Statistics)
Sentiment score: 2
Number of reviews: 3
32. Everydata: The Misinformation Hidden in the Little Data You Consume Every Day
Sentiment score: 1
Number of reviews: 3
33. Naked Statistics: Stripping the Dread from the Data
Sentiment score: 2
Number of reviews: 3
W W Norton Company
34. Using Multivariate Statistics (5th Edition)
Sentiment score: 2
Number of reviews: 3
37. Introduction to Bayesian Statistics, 2nd Edition
Sentiment score: 1
Number of reviews: 2
> she didn't know how useful it would be
probably more employable than geography
> Do you guys have any recommendations on where to develop my knowledge and skills?
There's a bunch of free and inexpensive stuff around .. but there's also a lot of bad free/inexpensive stuff around; you have to be a bit discerning (which is hard when you're trying to learn it).
It might sound a bit old-school but I'd suggest going to a university library and finding some decent stats texts; you probably want to avoid the stuff that says "11th edition".
Find several that you like and work with those for a while
Some you might look for:
Statistics, Freedman, Pisani & Purves (any edition)
Introduction to the Practice of Statistics, Moore & McCabe (5th edition or earlier)
For a bit of theory (you'll need a bit of mathematics for this but not a ton of it):
Introduction to the Theory of Statistics, Mood, Graybill and Boes
These are all old books. You should be able to get them second hand for cheap, or read them in a library. They'll be a good grounding, but you'll need to be able to ask questions as well.
Places like this one and stats.stackexchange.com can be handy resources. I've seen determined people teach themselves a lot of statistics with only a bit of guidance so it can certainly be done.
> Do I need programming, if so, what would be the best programming language to learn?
It would be best to learn some, yes, because modern statistics relies on it heavily. You don't necessarily have to do it immediately but getting an early start (and using it to help with learning stats) will be better than leaving it really long.
Two main things are widely used: R and Python. Both are free. The second is more of a mainstream programming language; the first is a statistics package as well as a more specialized language.
Learn one or both; my own suggestion would be to try R but other people may have different advice.
If you want to be a programmer rather than a statistician who uses code to solve statistical problems, Python would be the better choice.
As you wish to get into applied statistics (i.e. actually analyzing data), you'll need software. I'd strongly recommend learning and using R because it's completely free and incredibly powerful.
Here are some resources for learning statistics using R:
Then, these websites provide very valuable resources for doing statistics with R:
Hope that helps.
Books:
"Doing Bayesian Data Analysis" by Kruschke. The instruction is really clear and there are code examples, and a lot of the mainstays of NHST are given a Bayesian analogue, so that should have some relevance to you.
"Bayesian Data Analysis" by Gelman. This one is more rigorous (notice the obvious lack of puppies on the cover) but also very good.
Free stuff:
"Think Bayes" by our own resident Bayesian apostle, Allen Downey. This book introduces Bayesian stats from a computational perspective, meaning it lays out problems and solves them by writing Python code. Very easy to follow, free, and just a great resource.
Lecture: "Bayesian Statistics Made (As) Simple (As Possible)" again by Prof. Downey. He's a great teacher.
I'm not a design expert, but I do know that just because Nate uses Excel himself doesn't mean that he's the guy generating these plots. I'm fairly certain that most of the journalists putting these together are using ggplot from R or Python.
If you're interested in exact replicas, your language can do 80% of the heavy lifting by giving you the bones of the structure. But to really bring it home, you need a program like Inkscape or Illustrator to polish these up.
I don't think there's any language now that effectively uses good design sensibilities. This is discussed a bit in the book Visualize This by Nathan Yau.
For most people, it looks like the Python / R tutorials listed here should get the job done.
edit: a word
If it helps, here are some free books to go through:
Linear Algebra Done Wrong
Paul's Online Math Notes (fantastic for Calc 1, 2, and 3)
Basic Analysis
Basic Analysis is pretty basic, so I'd recommend going through Rudin's book afterwards, as it's generally considered to be among the best analysis books ever written. If the price tag is too high, you can get the same book much cheaper, although with crappier paper and softcover via methods of questionable legality. Also because Rudin is so popular, you can find solutions online.
If you want something better than online notes for univariate Calculus, get Spivak's Calculus, as it'll walk you through single-variable Calculus using more theory than a standard math class. If you're able to get through that and Rudin, you should be good to go once you get good at linear algebra.
Did any of your calc classes include multivariate/vector calculus? E.g. things dealing with double and triple integrals.
If not, take another calc class or two; calculus is very important for statistics. It shouldn't be too hard to pick up the rest of the necessary calc since you've already got a good calc background.
If so, start taking probability and statistics courses in your school's math department if you can. The mathematical way (read: the right way) of understanding probability and statistics is based on probability distributions (like the normal distribution), defined by their probability functions. As such, you can use calculus to obtain a myriad of information from them! For instance, among many other things, within the first one or 2 courses, you'd likely be able to answer at least the Spearman's coefficient question, the Bernoulli process question, and the MLE question.
If you don't have room in your schedule to do the stats course, you could get a textbook and try learning on your own. There are tons of excellent resources. Hogg, Tanis, and Zimmerman is pretty good for an introduction, though I'm sure there's better out there.
Anything by Tufte and the Flowing Data book and blog are great starting places. Tufte is more theory driven, for lack of a better term, while the Flowing Data sources have more "worked" examples (with R, Python, etc).
It would be worth learning ggplot2 as well if you are interested in data visualization as that seems to be the current "standard" tool. Hadley Wickham's website and UseR book on ggplot2 are great places to start.
Relatedly, Wickham's PhD thesis is all about tools and strategies for data visualization and can be found for free on his website. There is also an hour long seminar and slides to go with the paper.
If you take a look at the cover of this book you will see an ellipse (you can imagine this as a point cloud) and two lines running through the ellipse- a solid line, and a dotted line.
By eye, you may think the dotted line seems to cut through the ellipse the best, but the solid line is actually the regression line.
Imagine that you have an x-value, and you want to predict the corresponding y-value. The solid line is the best for this prediction. If you draw a vertical line anywhere on the graph, (fixing x), you will see that if you consider the intersection of the x-line with the ellipse, half of the intersection is above the solid line, and half below. The dotted line here does not fit as well. At the x-extrema of the ellipse, drawing a vertical line will place most of the intersection above or below the dotted line.
The assumption here is that your x value has no error, and the whole shape of the ellipse, or the variation in y, comes from noise.
If you repeat the whole thing, but instead fix y, and draw horizontal lines, and consider the intersection with the ellipse, you are now attempting to predict an x from a fixed y. Now, the solid line is abysmally bad, but the dotted line is ok, but not the best possible line.
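The dotted line is the major axis regression line. It treats x and y symmetrically, so it is the best single compromise if one line has to serve for predicting both x from y and y from x, even though it is not the best line for either task alone.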
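The three lines described above can be compared numerically. What follows is a minimal sketch, not anything taken from the book: the function name `fit_lines`, the simulated point cloud, and its parameters (correlation near 0.6, 2000 points) are all assumptions chosen for illustration. It computes the slope of the regression of y on x, the slope of the regression of x on y (drawn in the same x-y plane), and the slope of the major axis, which is the direction of the leading eigenvector of the covariance matrix.

```python
import math
import random


def fit_lines(xs, ys):
    """Return three slopes (all as dy/dx) for lines through the centroid."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

    # Regression of y on x: minimizes vertical errors (predict y from x).
    y_on_x = sxy / sxx
    # Regression of x on y: minimizes horizontal errors, expressed as dy/dx.
    x_on_y = syy / sxy
    # Major axis: slope of the leading eigenvector of the covariance matrix.
    major = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return y_on_x, major, x_on_y


# Hypothetical elliptical point cloud: unit variances, correlation about 0.6.
random.seed(0)
xs = [random.gauss(0, 1) for _ in range(2000)]
ys = [0.6 * x + 0.8 * random.gauss(0, 1) for x in xs]

y_on_x, major, x_on_y = fit_lines(xs, ys)
print(f"y on x: {y_on_x:.2f}  major axis: {major:.2f}  x on y: {x_on_y:.2f}")
```

For a positively correlated cloud the major axis slope falls between the other two, which matches the picture: the y-on-x line is flatter than the line that looks best by eye, and the x-on-y line is steeper.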
Here's my favorite general, theoretical intro to Bayesian stats, by the author of the logic of science book above. Interesting to read and not too long-
http://bayes.wustl.edu/etj/articles/general.background.pdf
More...This one tries to re-teach stats from square one. It's alright, but stops short of Markov Chain Monte Carlo, which is where things get fun.
http://www.amazon.co.uk/Introduction-Bayesian-Statistics-William-Bolstad/dp/0470141158/ref=sr_1_1?ie=UTF8&s=books&qid=1280142090&sr=1-1
This is the one I'm reading now, which explains Bayes for people in the social sciences, and makes an effort to break down the cool stuff into simple terms. I really like the writer and it's good so far-
http://www.amazon.co.uk/Bayesian-Methods-Behavioral-Sciences-Statistics/dp/1584885629/ref=sr_1_2?ie=UTF8&s=books&qid=1280142179&sr=1-2
I don't understand the question. Isn't this easy? Just follow the KISS principle (keep it simple, stupid). They've presumably seen histograms somewhere. Just present the kernel density estimate, presumably with optimal bandwidth chosen by cross-validation or something, which is way too complicated to explain, as a better competitor to the histogram. Explain the kernel density estimate as a better estimate of the "theoretical histogram" (I get this terminology from Freedman, Pisani, and Purves, an excellent "statistics for poets" book), which is what the histogram would be if you had an infinite amount of data. No one believes the theoretical histogram actually has jumps like a histogram (estimate), so why not use a smooth estimate like the kernel density estimate? That's almost ELI5.
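To make the histogram versus kernel density comparison concrete, here is a small sketch using only the Python standard library. The helper names `histogram_density` and `kde_gaussian` are made up for this example, the data are simulated, and the bandwidth comes from a simple normal-reference rule of thumb rather than the cross-validation mentioned above, which is swapped out here purely to keep the example short.

```python
import math
import random
import statistics


def histogram_density(data, lo, hi, bins):
    """Histogram rescaled to integrate to 1: a step-function density estimate."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in data:
        if lo <= v < hi:
            # min() guards against floating-point rounding at the top edge.
            counts[min(int((v - lo) / width), bins - 1)] += 1
    return [c / (len(data) * width) for c in counts]


def kde_gaussian(data, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = len(data) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in data) / norm

    return density


# Hypothetical sample: 300 draws from a standard normal distribution.
random.seed(0)
data = [random.gauss(0, 1) for _ in range(300)]

# Rule-of-thumb bandwidth (an assumption, not cross-validated).
h = 1.06 * statistics.stdev(data) * len(data) ** -0.2
kde = kde_gaussian(data, h)

steps = histogram_density(data, -4, 4, 16)
for i, height in enumerate(steps):
    mid = -4 + (i + 0.5) * 0.5  # centre of each 0.5-wide bin
    print(f"x={mid:+.2f}  histogram={height:.3f}  kde={kde(mid):.3f}")
```

The histogram column jumps from bin to bin, while the kernel estimate changes smoothly with x; both are estimates of the same underlying density.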
One of the standard recommendations for someone with a decent math background is All of Statistics by Wasserman. I personally found the style to be lacking on the pedagogical side in that there's next to no hand-holding when it comes to the exercises, but maybe you'll like it. The nice thing about it is that it covers much more than your usual "here's Bayes' theorem and a few things about sampling" book: bootstrapping, parametric inference, decision theory, causal inference, graphical models, some simulation methods, etc.
As for what next, it's hard to recommend anything without knowing exactly what you're interested in (biology is a pretty large field...).
Thanks! FWIW, I just ordered two books on the subject matter, All of Statistics: A Concise Course in Statistical Inference and Detection Theory
Also along with a third addition I just spent over $200 on books, ><, but they seem to have great reviews.
-----------------------------------
So let me tell you one of my biggest confusions from this post. Highlights are mine.
Ok so to keep things simple, let's just focus on one case, on one line. So, I don't get how
[; R(H_0 | X) = \lambda_{01} P(H_1 | X) ;]
Questions:
-------------------------------
I'll stop here for now so it doesn't get too complicated...
Thanks!
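One common reading of the identity asked about above, offered as an assumption because the post being quoted is not shown here: the conditional risk of deciding H_i is the posterior-weighted cost, R(H_i | X) = sum over j of lambda_ij P(H_j | X), where lambda_ij is the cost of deciding H_i when H_j is actually true. If correct decisions cost nothing (lambda_00 = 0), the H_0 term drops out and only lambda_01 P(H_1 | X) remains. The cost matrix and posterior values below are hypothetical numbers chosen for illustration.

```python
def conditional_risk(decision, posteriors, costs):
    """Expected cost of choosing `decision`, given posterior probabilities."""
    return sum(costs[decision][j] * p for j, p in enumerate(posteriors))


# costs[i][j] = lambda_ij, the cost of deciding H_i when H_j is true.
costs = [
    [0.0, 5.0],  # decide H0: free if H0 is true, lambda_01 = 5 if H1 is true
    [1.0, 0.0],  # decide H1: lambda_10 = 1 if H0 is true, free if H1 is true
]
posteriors = [0.7, 0.3]  # hypothetical P(H0 | X) and P(H1 | X)

risk_h0 = conditional_risk(0, posteriors, costs)  # 0 * 0.7 + 5 * 0.3
risk_h1 = conditional_risk(1, posteriors, costs)  # 1 * 0.7 + 0 * 0.3
print(f"R(H0 | X) = {risk_h0:.2f}, R(H1 | X) = {risk_h1:.2f}")
```

With these made-up numbers, deciding H_1 has the lower risk even though H_0 is the more probable hypothesis, because a wrong H_0 decision is assumed to be five times as costly.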
Good to know! As far as a good book goes, depends on what sort of level you are looking for. This book looks like an interesting sort of intro and seems to be well-reviewed, http://www.amazon.com/Naked-Statistics-Stripping-Dread-Data/dp/039334777X/ref=sr_1_2?ie=UTF8&qid=1453406226&sr=8-2&keywords=statistics , although I haven't actually read it.
Statistics is a really useful subject!
This might get pooh-poohed, but I really like some of the Schaum's Outline books.
https://www.amazon.com/Schaums-Outline-Probability-Statistics-4th/dp/007179557X
Why? They are packed full of sample problems and answers, and they tend to provide really concise definitions. I think one of the better ways to understand conditional probability is to see it applied to a range of clear examples.
Also, these books are ridiculously cheap. Tiny investment to make on the off chance you don’t love the format.
I still use this book to quickly brush up on specific concepts at least once a year.
This is super helpful, thank you!
And nothing against simulation, I know it's a powerful tool. I just don't want my foundations built on sand (I'm familiar with intro stats already).
Would Rudin's book on Real Analysis suffice: http://www.amazon.com/Principles-Mathematical-Analysis-International-Mathematics/dp/007054235X
Or are there even more advanced texts to pursue for Real Analysis?
I dove into this stuff almost two years ago with very little preparation or background. Now I'm in an MS program for Applied Statistics, and doing quite well. Here are some tips that worked for me:
Good luck.
This imo is a good book for basic probability and mathematical statistics. Super easy read with a lot of examples. [You also mentioned pdf's for books and someone told you Library Genesis. I can promise this one is on there :)]
Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences by Cohen, Cohen, West, and Aiken and Using Multivariate Statistics by Tabachnick and Fidell are both good for your situation, I think. They are easy to read, touch on a wide variety of popular methods, and have lots of examples with code and data from popular software (including SPSS).
Tfw I'm the most knowledgeable person about statistics I know and I have read 0 of these books. Time to get reading! Although I still want to go with Doing Bayesian Data Analysis: A Tutorial with R and BUGS over Gelman et al because I want to do all the work in R. The book itself has 51 reviews on Amazon, 44 of which are 5 stars, for a mean of 4.8. That seems very good.
Saved this thread for future reference. :)
I think this book by Ross is the standard advanced undergraduate text that gives a nice introduction to the subject. In my school it was the text used for a probability 2 course, and is also pretty well known around actuary circles. It's not a bad read for self study and I think the material is decent. It is expensive, but I see this book everywhere so it shouldn't be difficult for you to find a cheap copy. If anyone has a better introduction, do tell.
I would say: Go for it as long as you are interested in the job :)
For study references for remembering R and Statistics, I think all you would need would be:
For R, data cleaning and the like: http://r4ds.had.co.nz/. For basic statistics with R, probably Dalgaard for applied statistics with R, plus something like OpenIntro Statistics or Freedman for a review of stats.
If you want a math book with that perspective, I'd recommend E.T. Jaynes's "Probability Theory: The Logic of Science"; he delves into quite a lot of discussion of that topic.
If you want a popular science book on the subject, try "The Theory That Would Not Die".
Bayesian statistics has, in my opinion, been the force that has attempted to reverse this particular historical trend. However, that viewpoint is unlikely to be shared by all in this area. So take my viewpoint with a grain of salt.
This is the book that my Bayes course uses. I don't know if it's any good - I take it this fall as well and haven't ordered the book yet, but I'm hoping there is at least some good reason why it was chosen (and no, the author isn't my professor): https://www.amazon.com/Bayesian-Statistical-Methods-Springer-Statistics/dp/0387922997
Along the lines of probability, I recommend The Art of Probability. I also like the Schaum's Outlines of Probability and Statistics. If you want something more mathematical (calculus-based), All of Statistics by Wasserman is a solid reference.
There is no causality in a linear model and statistics of regression don’t involve causality whatsoever.
You can find lengthy discussion of this in OLS textbooks by David Freedman
Check out this book: https://www.amazon.com/Statistical-Models-Practice-David-Freedman/dp/0521743850
It offers pretty straightforward explanations of the questions you are asking, with nice proofs of the results for basic OLS, and is very explicit about which assumptions are needed for which results.
Great question. Start by understanding the different ways that data can be manipulated:
And then, as /medquien pointed out, investigate the data source. Especially if it's reported in the media but based on a research paper or - as we're seeing these days - an election poll. You can learn a lot by reading the methodology behind the polls.
I actually co-authored a book - Everydata - about some of the (many) ways that people manipulate and misinterpret data. For example, the claim that 4 out of 5 pediatricians recommend Gerber baby food was based on cherry picked data (only 12% of the pediatricians surveyed actually recommended Gerber). If you look at a map that uses the Mercator projection (you've probably seen one in classrooms), Africa looks the same size as Greenland - when in reality it's 14x larger. Shameless plug: if you like reading about these types of things, you can check out my co-author's blog, follow @Everydata on Twitter, or come see us speak at TEDxBuffalo.
Fantastic
EDIT: Added to my wishlist on Amazon: http://www.amazon.com/Using-Multivariate-Statistics-5th-Edition/dp/0205459382/ref=tmm_pap_title_0
Honestly, ignore the "for engineering" part of "Statistics for Engineering." They're largely the same content.
How much calculus have you taken? Does the class use calculus?
First, the cartoon guide to statistics is surprisingly helpful for some people.
For a more traditional textbook, you might try Devore's main intro book.
Almost every student finds statistics confusing and it's either difficult to teach, or just difficult to learn. It's also a fractal discipline, since you can keep going deeper and deeper, but it's generally just going over the same few concepts with additional depth. If you end up in a class that's not well suited to your mathematical background it's especially frustrating.
Good luck.
The R Book by Crawley is great for this. Send me a pm if you need to find it on the interweb.
PDF WARN: Introduction to Math Stat by Hogg
Not to be confused with Probability and Math Stat by Tanis and Hogg, which is a "first semester" course.
Good blend of theory and "talky-ness", good exercises that test your understanding, most should be do-able from just applying the basics.
I have used the following two books:
www.amazon.com/gp/aw/d/0470141158?pc_redir=1411138170&robot_redir=1
http://web4.cs.ucl.ac.uk/staff/d.barber/pmwiki/pmwiki.php?n=Brml.HomePage?from=Main.Textbook
Good read on the topic and history of the rise of Bayesian Statistics here:
http://www.amazon.com/The-Theory-That-Would-Not/dp/0300188226
>https://www.amazon.com/Bayesian-Statistical-Methods-Springer-Statistics/dp/0387922997
Is rethinking good for a beginner?
Khan Academy, free.
If you want problems and answers, I highly recommend the Schaum's guides. You’ll need to pick the right one for her level, but basically there are a lot of problems and answers to help understand the issues.
https://www.amazon.ca/Schaums-Outline-Probability-Statistics-4th/dp/007179557X
This is a really good book on Bayesian statistics, but Kruschke is coming out with a new edition in about two months with completely different code. It's going to use JAGS and Stan instead of BUGS.
Schaum's Outline of Probability and Statistics is a good review with lots of practice problems. Check out the videos on Khan Academy too, they really helped me with some of the concepts.
Naked Statistics
>https://www.amazon.com/Everydata-Misinformation-Hidden-Little-Consume/dp/1629561010
Elements for beginners? hmm
The Theory that Would Not Die
Well I'll be damned.
Ballsy title.
For undergrad probability, Pitman's book or Ross's two books here and here.
For graduate probability, Billingsley (h/t /u/DCI_John_Luther), Williams or Durrett.
Bayesian Data Analysis and Hoff are both well-respected. The first is a much bigger book with lots of applications, the latter is more of an introduction to the theory and methods.
The book All of Statistics gives a broad but (relatively) quick introduction to modern statistics.
How would you compare this with Freedman's Statistical Models? (https://www.amazon.com/Statistical-Models-Practice-David-Freedman/dp/0521743850)
Perhaps Tabachnick & Fidell would meet your needs
https://www.amazon.com/Probability-Statistics-Engineering-Sciences-Devore/dp/0538733527
https://www.amazon.com/Everydata-Misinformation-Hidden-Little-Consume/dp/1629561010
A very casual book. No math formulas at all.
Recommendation - don't learn statistics through "statistics for biology/ecology".
Go straight to statistics texts, the applied ones aren't that hard and they usually have fewer of the lost-in-translation errors (e.g. the abuse of p-values in all of biology).
Try Gelman and Hill -
http://www.amazon.com/Analysis-Regression-Multilevel-Hierarchical-Models/dp/052168689X/ref=sr_1_1?ie=UTF8&qid=1427768688&sr=8-1&keywords=gelman+hill
Faraway - Practical Regression and Anova using R (free)
http://cran.r-project.org/doc/contrib/Faraway-PRA.pdf
Categorical data analysis
http://www.amazon.com/Categorical-Data-Analysis-Alan-Agresti/dp/0470463635/ref=sr_1_1?ie=UTF8&qid=1427768746&sr=8-1&keywords=categorical+data+analysis