(Part 2) Top products from r/statistics

We found 63 product mentions on r/statistics. We ranked the 465 resulting products by number of redditors who mentioned them. Here are the products ranked 21-40.

Top comments that mention products on r/statistics:

u/efrique · 2 pointsr/statistics

> she didn't know how useful it would be

probably more employable than geography

> Do you guys have any recommendations on where to develop my knowledge and skills?

There's a bunch of free and inexpensive stuff around, but there's also a lot of bad free/inexpensive stuff around; you have to be a bit discerning (which is hard when you're trying to learn it).

It might sound a bit old-school but I'd suggest going to a university library and finding some decent stats texts; you probably want to avoid the stuff that says "11th edition".

Find several that you like and work with those for a while

Some you might look for:

Statistics, Freedman, Pisani & Purves (any edition)

Introduction to the Practice of Statistics, Moore & McCabe (5th edition or earlier)

For a bit of theory (you'll need a bit of mathematics for this but not a ton of it):

Introduction To The Theory Of Statistics, Mood Graybill And Boes

These are all old books. You should be able to get them second hand for cheap, or read them in a library. They'll be a good grounding, but you'll need to be able to ask questions as well.

Places like this one and stats.stackexchange.com can be handy resources. I've seen determined people teach themselves a lot of statistics with only a bit of guidance so it can certainly be done.

> Do I need programming, if so, what would be the best programming language to learn?

It would be best to learn some, yes, because modern statistics relies on it heavily. You don't necessarily have to do it immediately but getting an early start (and using it to help with learning stats) will be better than leaving it really long.

Two tools are widely used: R and Python. Both are free. Python is more of a mainstream programming language; R is a statistics package as well as a more specialized language.

Learn one or both; my own suggestion would be to try R but other people may have different advice.

If you want to be a programmer rather than a statistician who uses code to solve statistical problems, Python would be the better choice.


u/COOLSerdash · 9 pointsr/statistics
u/[deleted] · 10 pointsr/statistics

Books:

"Doing Bayesian Data Analysis" by Kruschke. The instruction is really clear and there are code examples, and a lot of the mainstays of NHST are given a Bayesian analogue, so that should have some relevance to you.

"Bayesian Data Analysis" by Gelman. This one is more rigorous (notice the obvious lack of puppies on the cover) but also very good.

Free stuff:

"Think Bayes" by our own resident Bayesian apostle, Allen Downey. This book introduces Bayesian stats from a computational perspective, meaning it lays out problems and solves them by writing Python code. Very easy to follow, free, and just a great resource.

Lecture: "Bayesian Statistics Made (As) Simple (As Possible)" again by Prof. Downey. He's a great teacher.
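To give a flavour of what "solving Bayesian problems by writing Python code" can look like, here is a minimal sketch of a discrete Bayesian update. It is not taken from any of the books above; the urn example, the function name `bayes_update`, and the numbers are all made up for illustration.

```python
from fractions import Fraction


def bayes_update(prior, likelihood):
    """Return the posterior over hypotheses, given prior and likelihood dicts."""
    # Multiply prior by likelihood for each hypothesis, then normalize.
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical setup: urn A holds 3 red and 1 blue ball, urn B holds 2 red and 2 blue.
# We pick an urn at random and draw one ball, which turns out to be red.
prior = {"urn_a": Fraction(1, 2), "urn_b": Fraction(1, 2)}
likelihood_red = {"urn_a": Fraction(3, 4), "urn_b": Fraction(2, 4)}

posterior = bayes_update(prior, likelihood_red)
print(posterior["urn_a"])  # 3/5
```

Exact fractions are used so the normalization step is easy to check by hand: (1/2 * 3/4) / (1/2 * 3/4 + 1/2 * 1/2) = 3/5.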

u/froggyenterprisesltd · 7 pointsr/statistics

I'm not a design expert, but I do know that just because Nate uses Excel himself doesn't mean that he's the guy generating these plots. I'm fairly certain that most of the journalists putting these together are using ggplot from R or python.

If you're interested in exact replicas, your language can do 80% of the heavy lifting by giving you the bones of the structure. But to really bring it home, you need a program like Inkscape or Illustrator to polish these up.

I don't think there's any language now that effectively uses good design sensibilities. This is discussed a bit in the book Visualize This by Nathan Yau.

For most people, it looks like the python / R tutorials listed here should get the job done.

edit: a word

u/beaverteeth92 · 3 pointsr/statistics

If it helps, here are some free books to go through:

Linear Algebra Done Wrong

Paul's Online Math Notes (fantastic for Calc 1, 2, and 3)

Basic Analysis


Basic Analysis is pretty basic, so I'd recommend going through Rudin's book afterwards, as it's generally considered to be among the best analysis books ever written. If the price tag is too high, you can get the same book much cheaper, although with crappier paper and softcover via methods of questionable legality. Also because Rudin is so popular, you can find solutions online.

If you want something better than online notes for univariate Calculus, get Spivak's Calculus, as it'll walk you through single-variable Calculus using more theory than a standard math class. If you're able to get through that and Rudin, you should be good to go once you get good at linear algebra.

u/mathnstats · 2 pointsr/statistics

Did any of your calc classes include multivariate/vector calculus? E.g. things dealing with double and triple integrals.

If not, take another calc class or two; calculus is very important for statistics. It shouldn't be too hard to pick up the rest of the necessary calc since you've already got a good calc background.

If so, start taking probability and statistics courses in your school's math department if you can. The mathematical way (read: the right way) of understanding probability and statistics is based on probability distributions (like the normal distribution), defined by their probability functions. As such, you can use calculus to obtain a myriad of information from them! For instance, among many other things, within the first one or two courses, you'd likely be able to answer at least the Spearman's coefficient question, the Bernoulli process question, and the MLE question.

If you don't have room in your schedule to do the stats course, you could get a textbook and try learning on your own. There are tons of excellent resources. Hogg, Tanis, and Zimmerman is pretty good for an introduction, though I'm sure there's better out there.
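As a small illustration of the "use calculus on the probability function" idea, here is a sketch of a maximum likelihood estimate for a Bernoulli process. Setting the derivative of the log-likelihood k·log(p) + (n − k)·log(1 − p) to zero gives p = k/n; the code checks that against a brute-force grid search. The data (7 successes in 20 trials) and the function name are invented for the example and are not the question from the original thread.

```python
import math


def bernoulli_log_likelihood(p, successes, trials):
    """Log-likelihood of `successes` out of `trials` for success probability p."""
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)


successes, trials = 7, 20  # made-up data

# Closed form from calculus: d/dp of the log-likelihood is zero at k / n.
closed_form = successes / trials

# Numerical check: evaluate the log-likelihood on a grid and take the maximizer.
grid = [i / 1000 for i in range(1, 1000)]
numeric = max(grid, key=lambda p: bernoulli_log_likelihood(p, successes, trials))

print(closed_form, numeric)
```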

u/iacobus42 · 4 pointsr/statistics

Anything by Tufte and the Flowing Data book and blog are great starting places. Tufte is more theory driven, for lack of a better term, while the Flowing Data sources have more "worked" examples (with R, Python, etc).

It would be worth learning ggplot2 as well if you are interested in data visualization as that seems to be the current "standard" tool. Hadley Wickham's website and UseR book on ggplot2 are great places to start.

Relatedly, Wickham's PhD thesis is all about tools and strategies for data visualization and can be found for free on his website. There is also an hour long seminar and slides to go with the paper.

u/maxwell_smart_jr · 2 pointsr/statistics

If you take a look at the cover of this book you will see an ellipse (you can imagine this as a point cloud) and two lines running through the ellipse- a solid line, and a dotted line.

By eye, you may think the dotted line seems to cut through the ellipse the best, but the solid line is actually the regression line.

Imagine that you have an x-value, and you want to predict the corresponding y-value. The solid line is the best for this prediction. If you draw a vertical line anywhere on the graph, (fixing x), you will see that if you consider the intersection of the x-line with the ellipse, half of the intersection is above the solid line, and half below. The dotted line here does not fit as well. At the x-extrema of the ellipse, drawing a vertical line will place most of the intersection above or below the dotted line.

The assumption here is that your x value has no error, and the whole shape of the ellipse, or the variation in y, comes from noise.

If you repeat the whole thing, but instead fix y, and draw horizontal lines, and consider the intersection with the ellipse, you are now attempting to predict an x from a fixed y. Now, the solid line is abysmally bad, but the dotted line is ok, but not the best possible line.

The dotted line is the major axis regression. It treats x and y symmetrically, so it is a compromise between the two prediction problems rather than the best line for predicting y from x or x from y.
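To see the three lines side by side, here is a minimal sketch in plain Python. The simulated point cloud, the seed, and all variable names are made up for illustration; the major axis slope is computed from the leading eigenvector of the 2x2 sample covariance matrix.

```python
import math
import random

random.seed(0)
n = 2000
# Simulated point cloud: y depends on x plus independent noise.
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [0.6 * x + random.gauss(0, 0.8) for x in xs]

mean_x = sum(xs) / n
mean_y = sum(ys) / n
var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Regression of y on x (minimizes vertical distances).
slope_y_on_x = cov_xy / var_x

# Regression of x on y (minimizes horizontal distances), expressed as dy/dx
# so that it can be drawn in the same plane as the other two lines.
slope_x_on_y = var_y / cov_xy

# Major axis: direction of the leading eigenvector of the covariance matrix.
diff = var_y - var_x
slope_major = (diff + math.sqrt(diff ** 2 + 4 * cov_xy ** 2)) / (2 * cov_xy)

print("y on x:    ", slope_y_on_x)
print("major axis:", slope_major)
print("x on y:    ", slope_x_on_y)
```

With positively correlated data, the y-on-x line is the shallowest of the three, the x-on-y line is the steepest when drawn in the same plane, and the major axis falls between them.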

u/jjrs · 4 pointsr/statistics

Here's my favorite general, theoretical intro to Bayesian stats, by the author of the logic of science book above. Interesting to read and not too long-
http://bayes.wustl.edu/etj/articles/general.background.pdf

More...This one tries to re-teach stats from square one. It's alright, but stops short of Markov Chain Monte Carlo, which is where things get fun.
http://www.amazon.co.uk/Introduction-Bayesian-Statistics-William-Bolstad/dp/0470141158/ref=sr_1_1?ie=UTF8&s=books&qid=1280142090&sr=1-1

This is the one I'm reading now, which explains Bayes for people in the social sciences, and makes an effort to break down the cool stuff into simple terms. I really like the writer and it's good so far-
http://www.amazon.co.uk/Bayesian-Methods-Behavioral-Sciences-Statistics/dp/1584885629/ref=sr_1_2?ie=UTF8&s=books&qid=1280142179&sr=1-2

u/berf · 1 pointr/statistics

I don't understand the question. Isn't this easy? Just follow the KISS principle (keep it simple, stupid). They've presumably seen histograms somewhere. Just present the kernel density estimate, presumably with optimal bandwidth chosen by cross-validation or something, which is way too complicated to explain, as a better competitor to the histogram. Explain the kernel density estimate as a better estimate of the "theoretical histogram" (I get this terminology from Freedman, Pisani, and Purves, an excellent "statistics for poets" book), which is what the histogram would be if you had an infinite amount of data. No one believes the theoretical histogram actually has jumps like a histogram (estimate), so why not use a smooth estimate like the kernel density estimate? That's almost ELI5.
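For anyone who wants to see the comparison concretely, here is a rough sketch of a Gaussian kernel density estimate next to a histogram density, in plain Python. Two things are assumptions made for brevity: the data are simulated, and the bandwidth uses the normal-reference rule of thumb (1.06 · sd · n^(-1/5)) rather than the cross-validation mentioned above. The helper names `make_kde` and `histogram_density` are invented for this example.

```python
import math
import random
import statistics


def make_kde(data, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Average of Gaussian bumps centred on each observation.
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

    return density


def histogram_density(data, x, bin_width):
    """Histogram density estimate at x, with bins aligned to multiples of bin_width."""
    left = math.floor(x / bin_width) * bin_width
    count = sum(1 for xi in data if left <= xi < left + bin_width)
    return count / (len(data) * bin_width)


random.seed(1)
data = [random.gauss(0, 1) for _ in range(500)]  # simulated sample

# Rule-of-thumb bandwidth (a stand-in for cross-validation).
h = 1.06 * statistics.stdev(data) * len(data) ** (-1 / 5)
kde = make_kde(data, h)

# The histogram jumps at bin edges; the KDE changes smoothly.
for x in (-0.01, 0.01, 0.49, 0.51):
    print(x, round(histogram_density(data, x, 0.5), 3), round(kde(x), 3))
```

The two points on either side of a bin edge (for example 0.49 and 0.51) can get quite different histogram values, while the kernel estimate barely moves between them.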

u/lewat · 3 pointsr/statistics

One of the standard recommendations for someone with a decent math background is All of Statistics by Wasserman. I personally found the style to be lacking on the pedagogical side in that there's next to no hand-holding when it comes to the exercises, but maybe you'll like it. The nice thing about it is that it covers much more than your usual "here's Bayes' theorem and a few things about sampling" book: bootstrapping, parametric inference, decision theory, causal inference, graphical models, some simulation methods, etc.

As for what next, it's hard to recommend anything without knowing exactly what you're interested in (biology is a pretty large field...).

u/Ayakalam · 1 pointr/statistics

Thanks! FWIW, I just ordered two books on the subject matter, All of Statistics: A Concise Course in Statistical Inference and Detection Theory

Also along with a third addition I just spent over $200 on books, ><, but they seem to have great reviews.

-----------------------------------

So let me tell you one of my biggest confusions from this post. Highlights are mine.

OK, so to keep things simple, let's just focus on one case, on one line. I don't get how
[; R(H_0 | X) = \lambda_{01} P(H_1 | X) ;]

Questions:

  • What is [; R(H_0 | X) ;]? Is it just a number?

  • He says that [; \lambda_{01};] is the 'cost of accepting H0, when in fact H1 was true'. Fine, that makes sense.

  • So why isn't [; R(H_0 | X) ;] just [; \lambda_{01} ;]? I don't get this. What is the conceptual difference between 'cost of picking H0' and 'risk by picking H0' here? Neo gives me a blue pill or a red pill. The cost of picking the wrong one is that I die. So what is my risk then? I need an example for this...

  • [; P(H_1 | X) ;] is the probability of accepting H1 given what you observed, X. First off, I do not know what that means. "The probability of accepting H1 given X". What does that even mean? To me this is nonsense. I am the one making the decision. How can you place a probability on it? Are they saying that if you show me 1000 cases of X, and I say "H1" 20% of the time, then [; P(H_1 | X) = 0.2 ;]? If not, then I am totally lost on the meaning of this.

-------------------------------

I'll stop here for now so it doesn't get too complicated...

Thanks!
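A small numerical sketch may help frame the cost-versus-risk question. It assumes the usual decision-theory reading, which may or may not match the post being quoted: [; P(H_1 | X) ;] is the posterior probability that H1 is true given the data, the conditional risk is the expected cost under that posterior, and the cost of a correct decision is zero. The function name and the numbers are hypothetical.

```python
def conditional_risk_h0(cost_01, posterior_h1, cost_00=0.0):
    """Expected cost of choosing H0 given the posterior probability of H1.

    cost_01: cost of choosing H0 when H1 is actually true.
    cost_00: cost of choosing H0 when H0 is true (assumed zero by default).
    """
    posterior_h0 = 1.0 - posterior_h1
    return cost_00 * posterior_h0 + cost_01 * posterior_h1


cost_01 = 10.0  # made-up cost of wrongly choosing H0

# The cost is fixed, but the risk scales with how likely H1 is given X.
for posterior_h1 in (0.0, 0.2, 1.0):
    print(posterior_h1, conditional_risk_h0(cost_01, posterior_h1))
```

Under these assumptions the risk equals the cost only when H1 is certain; if the data make H1 unlikely, choosing H0 carries little risk even when the cost of being wrong is large.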

u/bbbeans · 1 pointr/statistics

Good to know! As far as a good book goes, it depends on what sort of level you are looking for. This book looks like an interesting sort of intro and seems to be well-reviewed, http://www.amazon.com/Naked-Statistics-Stripping-Dread-Data/dp/039334777X/ref=sr_1_2?ie=UTF8&qid=1453406226&sr=8-2&keywords=statistics , although I haven't actually read it.

Statistics is a really useful subject!

u/Jb112358 · 2 pointsr/statistics

This might get pooh-poohed, but I really like some of the Schaum's Outline books.

https://www.amazon.com/Schaums-Outline-Probability-Statistics-4th/dp/007179557X

Why? They are packed full of sample problems and answers, and they tend to provide really concise definitions. I think one of the better ways to understand conditional probability is to see it applied to a range of clear examples.

Also, these books are ridiculously cheap. Tiny investment to make on the off chance you don’t love the format.

I still use this book to quickly brush up on specific concepts at least once a year.

u/charlesbukowksi · 1 pointr/statistics

This is super helpful, thank you!

And nothing against simulation, I know it's a powerful tool. I just don't want my foundations built on sand (I'm familiar with intro stats already).

Would Rudin's book on Real Analysis suffice: http://www.amazon.com/Principles-Mathematical-Analysis-International-Mathematics/dp/007054235X

Or are there even more advanced texts to pursue for Real Analysis?

u/vmsmith · 3 pointsr/statistics

I dove into this stuff almost two years ago with very little preparation or background. Now I'm in an MS program for Applied Statistics, and doing quite well. Here are some tips that worked for me:

  • If you don't have time to back up and regroup, check out Khan Academy, and this guy's YouTube videos. These can help with specific concepts.

  • If you have time to back up and regroup, check out Coursera, Udacity, EdX, and the other MOOCs. Coursera in particular has some very good courses dealing with statistics.

  • Take a look at Statistics for Dummies and Naked Statistics.

  • Use Reddit and StackOverflow. But use them wisely, and only after you've exhausted other means.

Good luck.

u/Bomb3213 · 1 pointr/statistics

This imo is a good book for basic probability and mathematical statistics. Super easy read with a lot of examples. [You also mentioned PDFs for books and someone told you Library Genesis. I can promise this one is on there :)]

u/gpark · 1 pointr/statistics

Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences by Cohen, Cohen, West, and Aiken and Using Multivariate Statistics by Tabachnick and Fidell are both good for your situation, I think. They are easy to read, touch on a wide variety of popular methods, and have lots of examples with code and data from popular software (including SPSS).

u/Deleetdk · 4 pointsr/statistics

Tfw I'm the most knowledgeable person about statistics I know and I have read 0 of these books. Time to get reading! Although I still want to go with Doing Bayesian Data Analysis: A Tutorial with R and BUGS over Gelman et al because I want to do all the work in R. The book itself has 51 reviews on Amazon, 44 of which are 5 stars, for a mean of 4.8. That seems very good.

Saved this thread for future reference. :)

u/umib0zu · 6 pointsr/statistics

I think this book by Ross is the standard advanced undergraduate text that gives a nice introduction to the subject. In my school it was the text used for a probability 2 course, and it is also pretty well known in actuarial circles. It's not a bad read for self study and I think the material is decent. It is expensive, but I see this book everywhere so it shouldn't be difficult for you to find a cheap copy. If anyone has a better introduction, do tell.

u/Sarcuss · 6 pointsr/statistics

I would say: Go for it as long as you are interested in the job :)

For study references for remembering R and Statistics, I think all you would need would be:

For R, data cleaning and such: http://r4ds.had.co.nz/ and for basic statistics with R probably either Dalgaard for Applied Statistics with R and something like OpenIntroStats or Freedman for review of stats

u/mrdevlar · 2 pointsr/statistics

If you want a math book with that perspective, I'd recommend E.T. Jaynes' "Probability Theory: The Logic of Science"; he delves into quite a lot of discussion about that topic.

If you want a popular science book on the subject, try "The Theory That Would Not Die".

Bayesian statistics has, in my opinion, been the force that has attempted to reverse this particular historical trend. However, that viewpoint is unlikely to be shared by all in this area. So take my viewpoint with a grain of salt.

u/Kalrog · 3 pointsr/statistics

This is the book that my Bayes course uses. I don't know if it's any good - I take it this fall as well and haven't ordered the book yet, but I'm hoping there is at least some good reason why it was chosen (and no, the author isn't my professor): https://www.amazon.com/Bayesian-Statistical-Methods-Springer-Statistics/dp/0387922997

u/navyjeff · 2 pointsr/statistics

Along the lines of probability, I recommend The Art of Probability. I also like the Schaum's Outlines of Probability and Statistics. If you want something more mathematical (calculus-based), All of Statistics by Wasserman is a solid reference.

u/merkaba8 · 2 pointsr/statistics

There is no causality in a linear model and statistics of regression don’t involve causality whatsoever.

You can find lengthy discussion of this in OLS textbooks by David Freedman

Check out this book: https://www.amazon.com/Statistical-Models-Practice-David-Freedman/dp/0521743850

It offers pretty straightforward explanations of the questions you are asking, with nice proofs of the results for basic OLS, and is very explicit about which assumptions are needed for which results.

u/MikeGluck · 5 pointsr/statistics

Great question. Start by understanding the different ways that data can be manipulated:

  • Cherry picking based on time ranges, etc.
  • Using a sample that doesn't represent the full population
  • Manipulating the scales / axes on charts
  • Implying causation when the only proven fact is correlation (huge telltale sign)

And then, as /medquien pointed out, investigate the data source. Especially if it's reported in the media but based on a research paper or - as we're seeing these days - an election poll. You can learn a lot by reading the methodology behind the polls.

I actually co-authored a book - Everydata - about some of the (many) ways that people manipulate and misinterpret data. For example, the claim that 4 out of 5 pediatricians recommend Gerber baby food was based on cherry picked data (only 12% of the pediatricians surveyed actually recommended Gerber). If you look at a map that uses the Mercator projection (you've probably seen one in classrooms), Africa looks the same size as Greenland - when in reality it's 14x larger. Shameless plug: if you like reading about these types of things, you can check out my co-author's blog, follow @Everydata on Twitter, or come see us speak at TEDxBuffalo.

u/clm100 · 2 pointsr/statistics

Honestly, ignore the "for engineering" part of "Statistics for Engineering." They're largely the same content.

How much calculus have you taken? Does the class use calculus?

First, the cartoon guide to statistics is surprisingly helpful for some people.

For a more traditional textbook, you might try Devore's main intro book.

Almost every student finds statistics confusing and it's either difficult to teach, or just difficult to learn. It's also a fractal discipline, since you can keep going deeper and deeper, but it's generally just going over the same few concepts with additional depth. If you end up in a class that's not well suited to your mathematical background it's especially frustrating.

Good luck.

u/topheroly · 1 pointr/statistics

The R Book by Crawley is great for this. Send me a pm if you need to find it on the interweb.

u/NegativeNail · 1 pointr/statistics

PDF WARN: Introduction to Math Stat by Hogg

Not to be confused with Probability and Math Stat by Hogg and Tanis, which is a "first semester" course.

Good blend of theory and "talky-ness", good exercises that test your understanding, most should be do-able from just applying the basics.

u/yarasa · 1 pointr/statistics

I have used the following two books:

  1. Good introduction, with a discussion of frequentist vs Bayesian statistics:

    www.amazon.com/gp/aw/d/0470141158?pc_redir=1411138170&robot_redir=1

  2. PDF available online, more machine learning oriented:

    http://web4.cs.ucl.ac.uk/staff/d.barber/pmwiki/pmwiki.php?n=Brml.HomePage?from=Main.Textbook

u/AndersonCoopersDick · 1 pointr/statistics

Good read on the topic and history of the rise of Bayesian Statistics here:

http://www.amazon.com/The-Theory-That-Would-Not/dp/0300188226

u/datascigeek · 3 pointsr/statistics

Khan Academy, free.

If you want problems and answers, I highly recommend the Schaums guides. You’ll need to pick the right one for her level, but basically there are a lot of problems and answers to help understand the issues.
https://www.amazon.ca/Schaums-Outline-Probability-Statistics-4th/dp/007179557X

u/coffeecoffeecoffeee · 4 pointsr/statistics

This is a really good book on Bayesian statistics, but Kruschke is coming out with a new edition in about two months with completely different code. It's going to use JAGS and Stan instead of BUGS.

u/gatordan · 2 pointsr/statistics

Schaum's Outline of Probability and Statistics is a good review with lots of practice problems. Check out the videos on Khan Academy too; they really helped me with some of the concepts.

u/4ngry4vian · 8 pointsr/statistics

For undergrad probability, Pitman's book or Ross's two books here and here.

For graduate probability, Billingsley (h/t /u/DCI_John_Luther), Williams or Durrett.

u/CrazyStatistician · 10 pointsr/statistics

Bayesian Data Analysis and Hoff are both well-respected. The first is a much bigger book with lots of applications, the latter is more of an introduction to the theory and methods.

u/blind_swordsman · 0 pointsr/statistics

The book All of Statistics gives a broad but (relatively) quick introduction to modern statistics.

u/klaxion · 5 pointsr/statistics

Recommendation - don't learn statistics through "statistics for biology/ecology".

Go straight to statistics texts, the applied ones aren't that hard and they usually have fewer of the lost-in-translation errors (e.g. the abuse of p-values in all of biology).

Try Gelman and Hill -

http://www.amazon.com/Analysis-Regression-Multilevel-Hierarchical-Models/dp/052168689X/ref=sr_1_1?ie=UTF8&qid=1427768688&sr=8-1&keywords=gelman+hill

Faraway - Practical Regression and Anova using R (free)

http://cran.r-project.org/doc/contrib/Faraway-PRA.pdf

Agresti - Categorical Data Analysis

http://www.amazon.com/Categorical-Data-Analysis-Alan-Agresti/dp/0470463635/ref=sr_1_1?ie=UTF8&qid=1427768746&sr=8-1&keywords=categorical+data+analysis