(Part 2) Best statistical software books according to redditors

We found 115 Reddit comments discussing the best statistical software books. We ranked the 54 resulting products by number of redditors who mentioned them. Here are the products ranked 21-40.

Top Reddit comments about Mathematical & Statistical Software:

u/CapaneusPrime · 14 points · r/RStudio

Super minor nitpick:

RStudio is the development environment.

R is the language.

Presumably you want to become well versed in the latter rather than the former. It's an easy mistake to make though, since the two are so intertwined for most people as to become almost indistinguishable.

More to your point though:

Before learning anything, it's a good idea to ask yourself why you want to learn it, and what you hope to be able to do with it. Now, you mentioned two things:

  • Hypothesis testing.

  • Graphing 4 variables.

    Both of these are relatively simple, and if you have even the most rudimentary understanding of R, you could learn to do both in a couple of minutes.

    So, my question to you would be: in using R, is your goal to get quick, simple answers to straightforward questions, OR are you ultimately looking to be able to do much more complicated tasks? This isn't a judgemental question; not everyone needs to aspire to become an R god, and just needing something quick and dirty is perfectly okay.

    If the things you mentioned are more or less the extent of your needs, I'd suggest just googling what you need to do at the time and picking up what you need, more or less, through osmosis.

    However, if you have designs on being able to do amazingly complicated things, if you want to push R to its fullest, you'll need a more structured approach.

    One thing you absolutely must understand is that R is a package-based language. What this means for you is that, beyond the numerous ways you can do any task in any language, people have written countless* packages which contain all sorts of handy functions to do just about anything you could conceivably want to do.

    >* Okay, it's not really countless: there are (as of this writing) 12,620 packages on CRAN and 1,560 additional packages on Bioconductor. There are bunches more unofficial ones scattered about GitHub and others privately maintained, but you get the point, there's lots of them.

    So, for anything you want to do, you can approach it in one of two, very broad, ways:

  • Base R.

  • Using packages.

    When you are starting out, I think it's very important to get a good handle on Base R.

    I would start out with basically any introductory R book. Search on Amazon and just find one you like.

    Personally, I can recommend Using R for Introductory Statistics by John Verzani. It isn't for everyone, but if you're truly a beginner to both R and statistics more generally, it's a good reference text.

    After that, it's up to you where you want to take it. For me, the pantheon of R gods* I would pay tribute to are these four:

  • The god of tidiness - Hadley Wickham GitHub/u/hadley

  • The god of speed - Dirk Eddelbuettel GitHub

  • The god of art - Winston Chang GitHub

  • The god of sharing - Yihui Xie GitHub

    >*I'm sure every single person on that list would balk at being called a "god," but they'd be lying.

    It's no mistake that 3/4 of them work for RStudio.

    The god of tidiness.


    Hadley must be a complete neat-freak because he's the driving force behind the tidyverse,

    >The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.

    Once you branch out of base R, the tidyverse should be your first destination. It's not quite a new language unto itself, more like a very sophisticated dialect of the language you already know. Once you can speak "tidy," you can still communicate with the "base" speaking plebs, you just won't be able to imagine ever wanting to.*

    >* This is not exactly true, and might come across as gross and elitist, but the tidy paradigm really is substantially better. If you were designing a completely new language to do statistical computing, from scratch, today, the language would probably feel a lot like the tidyverse.

    Anyway, any book by Hadley Wickham is gold, and they're all available online for free. But R for Data Science is a good first step into a larger world.

    The god of speed.


    I imagine Dirk is not a patient man. He's very active on forums, and basically every meaningful response on stackexchange to an Rcpp-related question is his (or that of his collaborator, lesser-god Romain Francois), but sometimes his responses can seem a little... terse?

    Now, R is notoriously slow. It's much maligned for this, usually fairly, sometimes not.

    Much of the perceived slowness can be mitigated in base R by learning the suite of apply functions, which work on a whole collection at once. That is, they take a multivalued variable (a vector, matrix, or list) and apply the same function to each element. It's often much faster, and usually cleaner, than a naively written for-loop. However, you can't always get away from needing a for-loop, and sometimes your loop will need to run thousands (or millions) of times. That's where the Rcpp package, which Dirk maintains, comes into play.

    It is an interface between R and C++; there's not much more to say about the package itself. You'll need to learn at least some rudimentary C++ to make use of it, but simply breaking out a computationally intensive for-loop into an Rcpp function can yield a huge improvement in run times: 10x-100x (or more), depending on how well (or poorly) optimized your R and C++ code is. There's some weirdness involved (for example, you can't call an Rcpp function in a parallel apply function (from a separate package) unless your Rcpp function is loaded as part of a package, so for maximum benefit you'll need to learn how to write your own packages - praise be to Hadley).

    Rcpp includes some semantic "sugar" which allows you to write some things in C++ more like you would in R, but that's yet a third thing to learn.

    Also, Rcpp, much like the tidyverse, is more an ecosystem of interconnected packages than a single package.

    The god of art.


    Base R plots are ugly as sin. They just are, no one should use them ever, for any reason.*

    >*Exaggeration.

    That said, Winston's* ggplot2 is a revelation and a revolution in how graphics are created and presented.

    >* Yes, technically ggplot2 is also Hadley's and is part of the tidyverse, but Winston literally wrote the book on it. Okay, okay, Hadley technically created the package and has written books about it, I just find Chang's book more fitting to my needs.

    The "gg" in ggplot2 stands for "grammar of graphics", a common structure for describing the components of a visualization in a concise way.

    Learning ggplot2 will take you a long way toward being able to make beautiful graphical visualizations.

    The god of sharing.


    After you've learned all of the above, you can wrangle your messy data into something tidy and manageable, you can work on it cleanly and power through massive computations, and you can create stunning images from your data. But it all means nothing if you're the only one who sees it.

    This is where Yihui shines. He is the maintainer for the knitr package, and the author of Dynamic Documents with R and knitr. This will allow you to turn all of your work into PDFs or web pages to share with the world.

    It's super easy to get started with, much more complicated to master, but definitely worth it.

    To use it effectively, you'll need to learn rmarkdown, also by Yihui. You'll also want to start dabbling with LaTeX (if you're not proficient already), and to truly bend documents to your whim you'll need to learn to tinker with YAML.

    Closing remarks.


    It's a lot to master. Few ever will. Not everyone will agree on everything I've said, but I think the path to true mastery looks something like that.

    Best of luck!

u/danmanmoo · 5 points · r/EngineeringStudents

With an intro book! And the software, of course. It's not a hard language to learn, especially if you have C++ experience, since programming fundamentals such as loops, conditionals, and functions carry over to MATLAB. There's a bunch of introduction to MATLAB books on Amazon. This book looks like a good place to start. Also, there's /r/matlab, where most subscribers post questions they have about the software. Lastly, MIT OpenCourseWare has the notes for an Introduction to MATLAB course online here.

From my experience with learning MATLAB, the syntax and basic commands are fairly easy to understand. Most of the learning comes from figuring out what MATLAB is capable of, how to use its various capabilities, and how to find shortcuts in programming to avoid unnecessary and repetitive lines of code.

u/[deleted] · 3 points · r/math

I recommend Ideals, Varieties and Algorithms. It's a great introduction to the subject and the one I used for a 4th year/1st year graduate level course. It may be a "computational approach", but it definitely does not shy away from the theory and rigour. It would be ideal for self-study as well.


There is also an extremely beautiful application of algebraic geometry with roots of unity to the k-colouring problem in graph theory. Here is a set of slides introducing it: link
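
For readers who can't open the slides, here is a sketch of the standard polynomial formulation of k-colouring. It may or may not match the linked slides exactly, and the notation (a graph G = (V, E), one complex variable x_v per vertex) is assumed here for illustration:

```latex
% Assumed notation: G = (V, E) is a graph, k is the number of colours,
% and x_v is one complex variable per vertex v.
\begin{align*}
  x_v^{k} - 1 &= 0
    && \text{for every vertex } v \in V, \\
  \sum_{i=0}^{k-1} x_u^{\,k-1-i}\, x_v^{\,i} &= 0
    && \text{for every edge } \{u, v\} \in E.
\end{align*}
% G is k-colourable if and only if this system has a common solution over \mathbb{C}.
```

The first family of equations forces each x_v to be a k-th root of unity, which plays the role of a colour. Since x_u^k - x_v^k = (x_u - x_v) times the sum in the second line, and the left side is already zero, the edge equation holds exactly when x_u and x_v differ (if they were equal, the sum would be k·x_u^(k-1), which is nonzero). By Hilbert's Nullstellensatz the system has no solution exactly when 1 lies in the ideal generated by these polynomials, which is the kind of question a Gröbner basis computation can decide.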

u/StochasticExpress · 3 points · r/statistics

If you want to learn Machine Learning, go with Elements of Statistical Learning, as others have suggested.

But there are lots of concepts from Statistical Inference that pop up here and there when you read about Machine Learning algorithms: for example, MLE (Maximum Likelihood Estimation), EM (Expectation Maximization), sufficient statistics, efficient estimators, the Fisher Information Matrix, Cramér-Rao (CR) bounds, etc. These concepts are widely used in many machine learning topics. It might help you to have a grounding in general Statistical Inference.

However, I found Essential Statistical Inference: Theory and Methods to be a much better book than the book by Casella and Berger.

u/Herdo · 2 points · r/MensRights

Honestly, I am not too sure. I haven't delved deep into the MRM myself and I just stumbled upon that video a couple weeks ago. I then noticed her books mentioned in the sidebar a couple days ago. Of all the men's rights supporters I have come across, she is the only one I have found that focuses on boys in education. Like I said, though, I have only seen that one video and read a few things about her here on /r/MensRights.

I'm sure some of the older members will chime in soon enough though with more detailed information.

EDIT: I just did a quick search and came across this book, which seems promising. What's even more interesting is the dozens of books located down the page under the "Customers who bought this item also bought..." section. I'm not seeing anything UK-centric, but some of these books look interesting nonetheless.

u/beaverteeth92 · 2 points · r/statistics

If example problems are your thing, A Handbook of Statistical Analyses using SAS is a great book.

Just be aware that SAS is case insensitive and it's highly encouraged that you type keywords in all caps. This book doesn't do that.

u/flight_club · 2 points · r/math

What is your background?

http://www.amazon.com/Statistical-Inference-George-Casella/dp/0534243126
Is a fairly standard first year grad textbook which I quite enjoy. Gives you a mathematical statistics foundation.

http://www.amazon.com/All-Statistics-Concise-Statistical-Inference/dp/1441923225/ref=sr_1_1?ie=UTF8&s=books&qid=1278495200&sr=1-1
I've heard recommended as an approachable overview.

http://www.amazon.com/Modern-Applied-Statistics-W-Venables/dp/1441930086/ref=sr_1_1?ie=UTF8&s=books&qid=1278495315&sr=1-1
Is a standard 'advanced' applied statistics textbook.

http://www.amazon.com/Weighing-Odds-Course-Probability-Statistics/dp/052100618X
Is non-standard but as a mathematician turned probabilist turned statistician I really enjoyed it.

http://www.amazon.com/Statistical-Models-Practice-David-Freedman/dp/0521743850/ref=pd_sim_b_1
Is a book which covers classical statistical models. There's an emphasis on checking model assumptions and seeing what happens when they fail.

u/COOLSerdash · 2 points · r/AskStatistics

Here are some thoughts:

  • How strong an association is cannot be inferred from the magnitude of the p-values. Low p-values just provide some evidence against the hypothesis that these regression coefficients are zero. How strong their influence is depends on the actual coefficients, and subject-matter expertise is needed to interpret them properly.
  • These regression models have underlying assumptions, which is why you should inspect the model fit, especially the residuals (residuals vs. fitted, Q-Q plots of the residuals, etc.).
  • The R-squared is a bad measure for strength of relationship or predictive power of your model. Relevant post is here.
  • How did you arrive at these 3 parameters in the final model? Model selection is a very delicate topic in statistics and many books have been written solely on this (e.g. Frank Harrell's). Just picking the parameters on the basis of p-values, for example, is generally thought to be a bad idea. This book is free and has a nice chapter on model selection in linear models (Chapter 6, page 203).
  • Excel is good for many things. Statistics is not among them.
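
To make the first and third points concrete, here is a minimal sketch in plain Python (standard library only). Everything in it is an assumption for illustration: the data are hypothetical and fully deterministic, the helper names `fit_line`, `make_data`, and `two_sided_p` are made up, and the p-value uses a normal approximation rather than the exact t distribution. The same true slope of 0.1 is fitted on a small and a large sample.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x.

    Returns (slope, standard error of the slope, R-squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_error = math.sqrt(rss / (n - 2) / sxx)
    r_squared = 1.0 - rss / syy
    return slope, std_error, r_squared


def make_data(n, true_slope=0.1):
    """Hypothetical data: x alternates 0, 1 and the 'noise' cycles +1, +1, -1, -1.

    The noise averages to zero within each x group, so the fitted slope
    equals true_slope (up to rounding). n should be a multiple of 4.
    """
    xs = [i % 2 for i in range(n)]
    noise = [1.0 if i % 4 < 2 else -1.0 for i in range(n)]
    ys = [true_slope * x + e for x, e in zip(xs, noise)]
    return xs, ys


def two_sided_p(z):
    """Two-sided p-value from a normal approximation (an assumption; fine for large n)."""
    return math.erfc(abs(z) / math.sqrt(2))


results = {}
for n in (100, 40000):
    xs, ys = make_data(n)
    slope, se, r2 = fit_line(xs, ys)
    z = slope / se
    results[n] = (slope, z, two_sided_p(z), r2)
    print(f"n={n}: slope={slope:.3f}  z={z:.2f}  p={two_sided_p(z):.3g}  R^2={r2:.4f}")
```

The estimated slope and the R-squared are essentially the same in both runs, yet the p-value goes from unremarkable to tiny simply because the sample grew. The p-value reflects how much evidence there is that the coefficient is nonzero, not how large or important the effect is.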

u/donbeleibmejuswatch · 2 points · r/ECE

As everyone else has said, you should probably get Balanis for intro antennas, Pozar for microwave, and Balanis for the EM refresher. The TM line theory (the little needed) is covered pretty well in Chapter 2 of Balanis. Chapter 8's MoM coverage and intro to FDTD/FEM is good, but you may want to explore it a bit more. Try out the VlabsAntenna or get a FEM toolbox.

Along with the content on the Balanis CD, I'd also recommend trying:
this and its suggested books

Also if you'd like to get trolled hard as fuck because of the sheer difficulty, I suggest Kraus. You may need a place to crawl into a hole and die afterwards tho.

u/Tupiekit · 1 point · r/AskStatistics

tbh, through the class there's so much that I don't fully understand that it's hard to pinpoint just what it is that I don't get (I know, not very helpful). It is frustrating because I'll read the chapters, and once I'm done I'll realize I have no idea what I just read and how it connects to everything else, and re-reading doesn't help either. The only true example I can give is the book that I'm using; otherwise I'd be quoting the entire book haha.

https://www.amazon.com/Applied-Multivariate-Statistics-Biology-Health/dp/3319140922/ref=sr_1_1?ie=UTF8&qid=1519082994&sr=8-1&keywords=applied+multivariate+statistics+with+r

u/dizzylynn · 1 point · r/printSF

u/gerserehker · 1 point · r/learnpython

Ah how silly of me, I completely ruled out the part where I rooted the result and then drew it with my compass!

OK, I'm trying to work out what you've posted now... For a lot of these I need to have the axis in the center of the screen rather than the far edges.

Although I just entered what you did and the value isn't really different, it's just that I can't see the origin in the center.

With center

Without center....

Also - Why is it elliptical? Is that some setting or is that the way that's meant to be? Bit confused about that.

I'm still struggling to 'read' it a bit.... I'll try to explain in sentences what's happening (sometimes that helps....)

****

x = np.linspace(-1,1,1001)

This creates an array of values from -1 through to +1, and the 1001 is the amount of steps that are taken between them. The higher the third value, the greater the amount of steps and as a result accuracy of the graph curve.

y_upper = np.sqrt(1.0-x**2)

This creates an array of positive values based on the array x, so in this case there will be 1001 positive values. Assigns to the variable y_upper

y_lower = -y_upper

This creates an array of inverse values to the previous array.

plt.plot( x,y_upper,'r', x,y_lower,'r')

This plots both arrays onto the axis, in red.

plt.show()

This just displays the graph

****

So that's my understanding of the above - anything glaring that I'm missing?

Any reason that it's an ellipse and not actually a circle?
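
A rough sketch of what those lines compute, written with only the standard library so it runs without NumPy. The `linspace` helper here is a hypothetical stand-in for `np.linspace`, assuming the default behaviour of including both endpoints. Note that the third argument is the number of points, so 1001 points give 1000 equal gaps of 0.002.

```python
import math


def linspace(start, stop, num):
    """Return num evenly spaced values from start to stop, both included.

    Hypothetical stand-in for np.linspace with its default endpoint=True.
    """
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


x = linspace(-1.0, 1.0, 1001)

# Upper half of the unit circle: y = sqrt(1 - x^2).
# max(0.0, ...) guards against a tiny negative value from rounding near x = +/-1.
y_upper = [math.sqrt(max(0.0, 1.0 - xi ** 2)) for xi in x]

# Lower half: the same values with the sign flipped.
y_lower = [-yi for yi in y_upper]

print(len(x), "points,", len(x) - 1, "gaps of width", x[1] - x[0])

# Every (x, y) pair lies on the unit circle, so the data really is a circle.
worst = max(abs(xi ** 2 + yi ** 2 - 1.0) for xi, yi in zip(x, y_upper))
print("largest deviation from x^2 + y^2 = 1:", worst)
```

Since the points satisfy x² + y² = 1, the ellipse is most likely a display issue rather than a data issue: by default the plot window stretches the x and y axes independently. In matplotlib, calling `plt.axis('equal')` (or `plt.gca().set_aspect('equal')`) before `plt.show()` is the usual way to give both axes the same scale.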

Thanks very much.

Also - I was considering getting a book - I'm not sure what your thoughts are on that.

I was thinking about this one or maybe this one. Perhaps this is way too basic to warrant a book. Though it would be nice to continue learning as I move on to A Level material (maths) as well.

Cheers!

u/JackDracona · 1 point · r/learnmath

The best book I have found is Statistics in a Nutshell by Sarah Boslaugh. I found it surprisingly thorough, yet surprisingly concise at the same time. It doesn't make things too complicated or go into too much depth of the mathematics behind all of the topics, but it doesn't condescend or "dumb it down" either. And, it's a fraction of the cost of textbooks that don't do as good a job covering the same subjects.

http://www.amazon.com/Statistics-Nutshell-Sarah-Boslaugh/dp/1449316824/

u/Evilution84 · 1 point · r/rstats

I'm a REML mixed-model advocate and highly recommend this book http://www.amazon.com/Mixed-Effects-Models-S-PLUS-Statistics-Computing/dp/1441903178