Maths/Stats Question

Jonathan Bryant · Post by **Jonathan Bryant** » Wed Apr 13, 2016 8:18 pm

A question for those more competent than me when it comes to maths and stats (i.e. 99.999% of everybody).

I’m trying to establish the relative frequency of THING X happening in two groups of unequal size.

Let’s say,
In Group One, THING X happens on 10 of 20 occasions.
In Group Two, THING X happens on 5 of 30 occasions.

Given that to reach 50% in Group Two, THING X would have to happen on 15/30 occasions rather than 5/30 occasions, is it fair to say that THING X is three times more likely in Group One than Group Two? Or is it more complicated than that?

IM Jack Rudd · Post by **IM Jack Rudd** » Wed Apr 13, 2016 9:50 pm

Your samples are too small for you to state that with any confidence.

NickFaulks · Post by **NickFaulks** » Wed Apr 13, 2016 10:49 pm

Three times is a best estimate, though. Without doing the sums, I'd say that the true figure is pretty likely ( deliberately vague ) to be between two and five.

Jonathan Bryant · Post by **Jonathan Bryant** » Wed Apr 13, 2016 11:03 pm

Thanks to you both.

For the sake of argument let’s multiply the numbers then.

So
in Group 1 THING X happens 100 of 200 times
in Group 2 THING X happens 50 of 300 times

or even 1000 of 2000 and 500 of 3000.

In all cases in Group 2 the proportion is 1/3rd of 50% which is what you get in group 1. So is it now reasonable to say THING X is 3 times more likely to happen in Group 1 condition than in Group 2 condition?

And how big do the sample sizes have to be? This is Confidence Interval stuff, isn’t it?

NickFaulks · Post by **NickFaulks** » Wed Apr 13, 2016 11:13 pm

Yes, you're definitely homing in on 3 and it is about confidence intervals. It's hard to give specific numerical answers without a completely defined question.
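To put rough numbers on "homing in on 3", here is a minimal sketch of one common large-sample approximation: an interval for the ratio of two proportions built on the log scale. It assumes independent yes/no trials in each group, and the function name `risk_ratio_ci` is made up for illustration. Treat it as a rough guide only, especially for the 20/30 case, where the counts are small.

```python
import math

def risk_ratio_ci(x1, n1, x2, n2, z=1.96):
    """Ratio of proportions (x1/n1) / (x2/n2) with an approximate 95% interval.

    Uses the usual large-sample standard error of log(ratio); assumes
    independent binary trials and non-zero counts in both groups.
    """
    rr = (x1 / n1) / (x2 / n2)
    se = math.sqrt(1 / x1 - 1 / n1 + 1 / x2 - 1 / n2)
    low = math.exp(math.log(rr) - z * se)
    high = math.exp(math.log(rr) + z * se)
    return rr, low, high

# Same proportions (50% vs 16.7%) at three sample sizes.
for scale in (1, 10, 100):
    x1, n1, x2, n2 = 10 * scale, 20 * scale, 5 * scale, 30 * scale
    rr, low, high = risk_ratio_ci(x1, n1, x2, n2)
    print(f"{x1}/{n1} vs {x2}/{n2}: ratio {rr:.2f}, roughly {low:.2f} to {high:.2f}")
```

Under this approximation the interval runs from roughly 1.2 to 7.5 for 10/20 vs 5/30, tightens to about 2.2 to 4.0 at 100/200 vs 50/300, and to about 2.7 to 3.3 at 1000/2000 vs 500/3000. The point estimate stays at 3 throughout; only the uncertainty shrinks.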

MartinCarpenter · Post by **MartinCarpenter** » Wed Apr 13, 2016 11:32 pm

I seem to vaguely remember having an entire inductive logic course about the 'right' answer to this sort of question

Confidence intervals could be all over the place depending. If they really are binary (not continuous) and truly independent events for each trial then much easier.

Paul Dargan · Post by **Paul Dargan** » Thu Apr 14, 2016 5:40 am

Yep, convert both to percentages (so 50% and 16.6%) and compare the numbers.

Then you would need to think about the questions others have raised to decide on the confidence interval. Formally what you are asking is whether the confidence interval around the difference between the two numbers includes zero ... though in practice people often just look at the confidence interval of each number and see if they overlap.

Paul
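The two checks described above (an interval for the difference versus comparing the two separate intervals) can be sketched as follows. This uses the simple normal-approximation (Wald) interval, which is an assumption on my part and is known to be crude at sample sizes like 20 and 30; the function names are illustrative only.

```python
import math

def wald_interval(x, n, z=1.96):
    """Approximate 95% interval for a single proportion x/n."""
    p = x / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

def difference_interval(x1, n1, x2, n2, z=1.96):
    """Approximate 95% interval for the difference p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

group_one = wald_interval(10, 20)
group_two = wald_interval(5, 30)
diff = difference_interval(10, 20, 5, 30)

# Check 1: does the interval for the difference exclude zero?
difference_excludes_zero = diff[0] > 0 or diff[1] < 0
# Check 2: do the two separate intervals overlap?
intervals_overlap = group_one[0] <= group_two[1] and group_two[0] <= group_one[1]

print(f"Group One: {group_one[0]:.3f} to {group_one[1]:.3f}")
print(f"Group Two: {group_two[0]:.3f} to {group_two[1]:.3f}")
print(f"Difference: {diff[0]:.3f} to {diff[1]:.3f}")
print("difference excludes zero:", difference_excludes_zero)
print("separate intervals overlap:", intervals_overlap)
```

With these numbers the two checks disagree: the interval for the difference sits just above zero, while the two separate intervals overlap slightly. The overlap shortcut is the more conservative of the two, so it can miss a difference that the formal check picks up.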

Brian Valentine · Post by **Brian Valentine** » Thu Apr 14, 2016 7:53 am

Jonathan,
I am going to expand a bit on both Jack’s and Nick’s points.

My first point is to ask whether you have set up the question correctly. It's unusual to need a ratio of probabilities, and a proper solution needs to take into account the downsides of getting the estimates wrong.

Then there is what is happening in each trial. Let’s attempt an analogy: assume you have a large pot full of multi-coloured marbles and you are interested in picking out red ones.

Do you know anything about the number or the proportions of marbles?

Do the marbles picked out get put back or put aside?

In the second trial:
Are you using the same technique?

Is the pot the same?

If not, does the new pot have similarities with the first pot?

We might then be able to say something, even if it is framed in some uncertainty.

Brian

Jonathan Bryant · Post by **Jonathan Bryant** » Thu Apr 14, 2016 2:07 pm

thanks for the further responses.

This is the question I’m trying to answer.

A month or two ago I was listening to an interview in which the subjects (the Coen brothers) claimed that mainstream American films were much less diverse in terms of the characters portrayed on screen than films made elsewhere. I wanted to see if this was true of the films that I’d watched in the year to date.

So - leaving aside the accuracy or otherwise in my allocation of films to certain groups - I found that

10 of 20 non-mainstream films had a lead character who was female
5 of 30 mainstream American films had a lead character who was female.

So what I wanted to know was, for the films that I have personally seen this year so far, was it fair to say that non-Hollywood films were three times as likely to have a female central character as Hollywood films? Or do proportions/percentages not work like that (i.e. it’s not the same thing as saying 6 is three times as big as 2)?
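As a small worked check of how the proportions relate, here is a sketch using exact fractions. The variable names are just labels for the two groups of films described above. It shows that the "three times" figure is the ratio of the two proportions, which is a different number from their difference, and different again from the ratio you get if you count the films without a female lead.

```python
from fractions import Fraction

non_hollywood = Fraction(10, 20)   # proportion with a female lead
hollywood = Fraction(5, 30)

ratio_female = non_hollywood / hollywood                    # ratio of proportions
difference = non_hollywood - hollywood                      # gap between proportions
ratio_not_female = (1 - non_hollywood) / (1 - hollywood)    # same comparison, complements

print("ratio (female lead):", ratio_female)
print("difference:", difference)
print("ratio (no female lead):", ratio_not_female)
```

So "three times as likely" is a fair description of 50% against 16.7%, in the same sense that 6 is three times 2. It does not carry over to the complement, though: the films without a female lead are in a ratio of 3/5, not 1/3.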

Of course, this is all only of interest to me because it’s only the films that I’ve seen. Thereafter I could look at my sample and think about what it tells us - if anything - about the whole population of films released this year, but that’s a secondary question for me at the moment.

thanks again for the input.

MartinCarpenter · Post by **MartinCarpenter** » Thu Apr 14, 2016 3:41 pm

Well, excepting mistakes, the statement that the ratio was 3-1 in the films you've seen is of course true.

Extending it to all films wouldn't be especially fair. Small sample sizes, and even smaller actual values. The precise ratio is very volatile - one more female lead character in the American sample and it is down to a 2.5:1 ratio for instance.
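A quick sketch of that volatility, holding the non-Hollywood count at 10 of 20 and moving the Hollywood count by one film either way (the helper name `ratio` is just for illustration):

```python
def ratio(x1, n1, x2, n2):
    """Ratio of the two proportions x1/n1 and x2/n2."""
    return (x1 / n1) / (x2 / n2)

# One film fewer, the observed count, and one film more in the Hollywood group.
for hollywood_leads in (4, 5, 6):
    print(f"{hollywood_leads}/30 -> ratio {ratio(10, 20, hollywood_leads, 30):.2f}")
```

A single film moving between categories shifts the ratio between 3.75 and 2.5, which is why the headline figure of 3 should not be read too precisely.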

It is obviously evidence that the non-Hollywood proportion is higher; how strong, I'm not sure. It might not meet the typical requirements of statistical proof even for that.

There's also the matter that you might be biased in your selection of films. Maybe you don't think Hollywood can write lead parts for women!

Michael Farthing · Post by **Michael Farthing** » Thu Apr 14, 2016 4:00 pm

It seems as sound to me as some of the identical twin experiments of yesteryear! [Unless, of course, Jonathan has invented some of the films and they don't exist.]

(I suppose I'd better explain this:

Sir Cyril Burt investigated identical twins brought up together, or separated at birth, with the aim of deciding the famous nurture or nature debate about intelligence. Unfortunately, finding identical twins separated at birth is a non-trivial activity and later investigations accused him of simply inventing such individuals!)

Setting this aside, on the assumption that Jonathan has seen the films claimed, his sample size is not miles different from Sir Cyril's.

Sean Hewitt · Post by **Sean Hewitt** » Thu Apr 14, 2016 4:24 pm

Did you choose the films that you saw, or did you somehow see random films? If the former, then you really can't draw a conclusion because the difference (assuming there is one, and also assuming that you've got a large enough sample size) could be down to your choice of films rather than the choice of lead actor gender made by the various writers / casting directors.

Jonathan Bryant · Post by **Jonathan Bryant** » Thu Apr 14, 2016 4:34 pm

Sean Hewitt wrote: Did you choose the films that you saw, or did you somehow see random films?

A bit of both and neither.

I’m seeing anything that Everyman Cinema puts on. So I’m not choosing. But then it’s not random either because obviously the cinema chain aren’t choosing at random, they’re trying to profit maximise. Also, I do choose a bit, in that I’ll go to Barnet for The Last Man on the Moon, but not My Big Fat Greek Wedding 2.

Obviously there’s an issue with drawing conclusions about the population of films from my personal sample for all sorts of reasons. I was more interested in how percentages relate to each other. I’ve done some stats courses in the past and I remember getting tripped up on stuff like this - although I don’t remember well enough to be sure about anything.

Kevin Thurlow · Post by **Kevin Thurlow** » Fri Apr 15, 2016 10:15 am

So Zootropolis has a female lead character? No reason why not, as the cartoon is allegorical.

Jonathan Bryant · Post by **Jonathan Bryant** » Fri Apr 15, 2016 5:26 pm

Kevin Thurlow wrote: So Zootropolis has a female lead character? No reason why not, as the cartoon is allegorical.

Yes. She - Judy Hopps - is a rabbit, but she’s clearly a female rabbit.

Obvs that rather highlights the absurdity of trying to demonstrate something using only numbers without making any attempt to make qualitative judgements. One Zootropolis rabbit is hardly the equivalent of Victoria (from the German indie film of the same name) or Marguerite (from Marguerite), for instance.

English Chess Forum
