Praveen Rai on the challenges of polling in India and why he thinks seat predictions should be banned

Praveen Rai is the Academic Secretary at the Centre for the Study of Developing Societies (CSDS) in Delhi. Prior to working as Academic Secretary, he was a Project Manager at Lokniti, Programme for Comparative Democracy, a research programme of CSDS that specializes in election studies (full disclosure: CSDS is my primary research affiliate, and I am working with Lokniti on my research). While at Lokniti, he handled more than fifty election studies and opinion polls between 2005 and 2009.

His research interests include electoral politics, monitoring election opinion/exit polls and political participation of women in India. His seminal work “Electoral Participation of Women in India: Key Determinants and Barriers” was published as a special article in Economic and Political Weekly in 2011. He also co-authored a book, Measuring Voting Behaviour in India, with CSDS director Sanjay Kumar that was published by Sage in 2013. His articles have been published in Indian academic journals, edited books, newspapers and blogs. In much of his writing, he has been critical of the use of opinion polls as tools of political communication.

Last Wednesday, I sat down with Praveen in the CSDS lounge to talk about his experiences managing polls with Lokniti and his criticisms of media reporting on opinion polls. The transcript of the conversation below has been lightly edited for length and clarity.

On his involvement in public opinion research

Sam Solomon: Please tell me about your background conducting public opinion research in India and how you got into this field.

Praveen Rai: I am basically an LLB. I have done my master’s in history from Delhi University and my law [degree] from Delhi University. I have worked in different private companies, small and big, looking after legal affairs. But somehow that job didn’t interest me, being a very routine thing. So I ha[d] been trying to make a career switch.

I just got a chance to come to Lokniti while I was looking for a job. I was just working here as a research assistant. But while working on that slowly I developed interest in opinion polling.

SS: You didn’t have the interest before though.

PR: No. In fact, I had been interested in politics and looking into election surveys on television and newspapers. That used to interest me. But I had never thought of getting into this area.

So I came here as a research assistant on a very temporary contract looking for a job, making a job change. And slowly I got into this. So I started right from the ground by doing the first time exit poll, which went wrong for me, as a field investigator. And then I did a few more fieldworks, and then slowly I got into questionnaire designing. So with a period of time I got into this field more by default in the sense that that was not my job. But I started learning, right from questionnaire designing to conducting fieldwork to managing large projects, and then data analysis and all that. So slowly I got into this. I started writing about it. I got an opportunity at Lokniti. And over a period of time I developed an interest in this.

So I think I worked almost four years where I was the project manager to Lokniti, and I think I conducted more than fifty or sixty opinion polls. That included election studies and studies in democracy. And different kinds of opinion polls. The Prime Minister’s survey, to find out what are the opinions that people had of Prime Minister Manmohan Singh after his first term. In fact, during his term in three years. So slowly I got into that but then I learned the whole thing and it became an interest area for me.

SS: Over what time period is that? When did you start getting into Lokniti and CSDS?

PR: I have a very long association with Lokniti. As I told you, the first one when I came here in 2002 as a research assistant. So I worked there for six months, and as I told you I did field investigation, I did field supervision, about how interviews are done, how student field investigators do. And after that I again went into the private sector where I was handling corporate communications. But there also I conducted some election studies for some political parties also. That was a learning stage for me.

In 2005, when Lokniti got into an informal contract with CNN-IBN for doing all the election studies and different opinion polls, that was the time I got the invitation from Lokniti. In October 2005. Then I came here. But by that time I was managing projects, I knew how to do field investigation, supervision, and all those things. And from 2005 to December 2009 when I was here, I got a chance to get into questionnaire designing, data analysis, and then I started writing. Now I’m all about opinion polling right from field investigation to putting the reports out.

SS: You have an office at CSDS. Are you still affiliated with Lokniti?

PR: No. In 2009, I got into my current assignment, that is like work as an academic secretary. So my role is a little different. To look about academically and liaising with funders. The reason why I got here was this is a permanent job. In Lokniti, that was a contractual thing. Every year it used to be renewed. So at the moment I am not actively involved in opinion polls in this sense.

But still I write. If you see over the period of the last four or five years, it has changed in a sense. Now I don’t do much quantitative writings. I think everything is finished. All my writings and book chapters. And then there was a book also which I wrote with Sanjay [Kumar, Director of CSDS]. It’s called Measuring Voting Behaviour in India.

Since last year I started tracking opinion polls. The way opinion polls and findings– in fact, now ten or fifteen agencies do [opinion polls]. So five or seven agencies say this is winning, and five or seven agencies are saying the other party is winning. That confused me a lot. So then I got into writing about opinion polls. I tried to find out, I tried calling them up to find out what is their methodology, where was the fieldwork done, do you have the questionnaires. But the response from the different polling agencies has not been good.

SS: Were some better than others at responding?

PR: No, except for a few agencies like AC-Nielsen and all, which are market leaders in a sense. They are market research agencies who do it in a very, very proper methodological way. In fact, the quality of their surveys is, I would say, as good as Lokniti-CSDS. But apart from that, most of the agencies won’t even spell out who are the people who did the field investigation, what exactly was their sample, was it in residences or on streets. So these are things which are not coming out into the public domain.

Opinion polls are mostly on elections. Apart from elections, you get very few opinion polls. Eight to ten, you’ll get, to see in the country which are on other issues. That is mostly done by Lokniti. The agricultural studies, the youth study. Apart from that, most of them are doing [surveys] during elections. Round after round: pre-polls, post-polls, exit polls.

SS: You’ve been involved with Lokniti-CSDS for almost fifteen years, since 2002. I’d like to know how measurement of public opinion has changed over that time, for CSDS and more generally in India.

PR: I told you that since things are not in the public domain, we don’t have a forum where the oldest people who were involved in public opinion polling get together and share those things. Nothing of that sort has developed in the last ten years. Though there have been a lot of people who have been trying to take initiative, I think it is not going. So at the moment I can just speak about CSDS surveys, Lokniti surveys. So I will just speak about the time when I joined and how we changed. And before that we have the history which you can find out from our book where me and Sanjay, we have written a very long chapter on how the surveys in Lokniti have changed.

During my period, in fact, the changes which happened in Lokniti surveys [were] at the stage of getting all the questionnaires here, cleaning it up, data punching, and then doing the analysis. From 2005 onwards, slowly we started decentralizing data entry and data cleaning. We developed it in different states. So now what’s happened is that it’s not centralized in a sense. All the questionnaires from India don’t come here and then we don’t have to put in huge teams with deadlines to do all the data cleaning and data punching for the analysis. So we decentralized it.

Apart from that, also since the number of sample increased — if you see our National Election Study, the National Election Study’s sample increased — so what we started doing was that we had four or five questionnaires. Questions on elections and politics, we had ten or fifteen questions which were in all the questionnaires. But in different questionnaires we started asking people on different issues, like security issues, economy, and all that. That way, in fact, we had more spread of getting opinion about politics, economics, price rise, and all these things. So I think these are the two major changes that have happened in the last ten years in the Lokniti surveys.

But the common thread which runs right from the first survey until now, and that we haven’t changed, that is we still do face-to-face interview of sampled voters in their residence. That is one common thing that is there in the last forty years of CSDS surveys.

SS: Going back to the 60s.

PR: Yes. Starting with the 60s.

SS: My research is looking at sampling error and whether certain populations are more or less likely to be sampled and participate in a survey. I’d like to know whether, in your experience working on Lokniti-CSDS polls, certain populations were more or less likely to be sampled in surveys, because of cultural factors or socioeconomic factors.

PR: In fact, the best part of the opinion polling which you do on elections is that in the last ten years, all the voter lists are available online. So your universe of the study is– what happens is that you get the list of the complete universe. And we have been using the random stratified sampling system. So what has happened is that we have tested it in the 90s. We have been using those and once we get our data then there are certain indicators. We have matched the profile of the voters which we have done in our survey with the profile that exists in the state or at the all-India level. We have government data on the gender breakup, on religion, on caste and community. By using our random stratified sample, by the end of the 90s we could see that in fact we had been including a very, very representative sample. I think the sampling method in fact does not exclude anyone and is quite representative of the population.
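Editor’s note: the two steps Praveen describes here, drawing a stratified random sample from the voter list and then comparing the sample’s profile with official figures, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Lokniti’s actual procedure: the function names `stratified_sample` and `profile_gap`, the use of proportional allocation, the regions, and the synthetic voter list are all hypothetical.

```python
import random
from collections import Counter


def stratified_sample(voters, stratum_of, n, seed=0):
    """Allocate n interviews across strata in proportion to stratum size,
    then draw voters at random within each stratum."""
    rng = random.Random(seed)
    strata = {}
    for voter in voters:
        strata.setdefault(stratum_of(voter), []).append(voter)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = round(n * len(members) / len(voters))  # proportional allocation
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


def profile_gap(sample, population, attr):
    """Percentage-point gap between sample share and population share
    for every value of one attribute (e.g. gender or region)."""
    in_sample = Counter(attr(v) for v in sample)
    in_population = Counter(attr(v) for v in population)
    return {
        value: 100 * (in_sample[value] / len(sample)
                      - in_population[value] / len(population))
        for value in in_population
    }


# Synthetic voter list (made-up data): three regions of unequal size.
voters = [
    {"region": region, "gender": "F" if i % 2 else "M"}
    for region, size in [("A", 3000), ("B", 2000), ("C", 1000)]
    for i in range(size)
]
sample = stratified_sample(voters, lambda v: v["region"], n=600)
print(profile_gap(sample, voters, lambda v: v["gender"]))
```

Gaps near zero on attributes that were not used for stratification (gender in this toy case) are the kind of check he refers to when he says the survey profile was matched against government data.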

SS: Even with the sampling being properly inclusive, did you find that certain populations had a higher likelihood of non-response and not participating in the survey?

PR: Well, I think not. We don’t have data for that. No, in fact, we have data for that and there is no difference in the non-response based on caste, community, or socioeconomic [level]. Non-response has been for many reasons, like people not living in that area, or migrating somewhere, or the day you reach for the interview they’re out of their village, they’re not in their residences. But there is no difference in the sense that the non-response of rich people is more or the non-response of poor people is less. It’s more or less uniform.

SS: What about for men or women? For religious minorities?

PR: For that we also don’t see any differentiation because what we do for the Lokniti surveys, you know how the network operates, we have a network of different universities and they’re mostly students who are doing their master’s or their MPhil. They are trained in how to do interviews, how to handle questionnaires. There we keep a gender equilibrium, in the sense that 50% of the investigators are men and 50% are women. Female investigators mostly go and do interviews with females because in villages or even if you live in religious communities, like Muslims, for a male to go and interview is a problem. So we have been using female investigators for the interview so we don’t have any problem of non-response from a particular section like Muslim women. Even in north India where you have the purdah system, where a male cannot go and do a face-to-face interview, but definitely the females go to do [the] interview. So we have taken care of this. I don’t think there is any high non-response.

SS: In your article for Economic and Political Weekly, “Fallibility of Opinion Polls in India,” you write about four types of errors: coverage error, sampling error, measurement error, non-response error. Which of these four types of error do you think presents the most significant challenge for Indian researchers? If you have specific examples you can refer to in your time working for Lokniti or CSDS, that would be most helpful.

PR: When I am talking about all these errors, then definitely it is there in most of the surveys. But you have to see how high it is. Even if you see a non-response error — if it goes to three or four or five percent — it doesn’t make much difference in the sample, because the non-response of people… It’s not like the non-response of male respondents are more or of female respondents are less. I think non-response error does not create much problem. I am talking specifically for the Lokniti surveys. After that, I’ll talk about the marketing or opinion polls. In fact, all these errors are within the limits and they cancel each other out and they don’t have [an] effect on the total sample.

But now as far as the market research opinion polls which are coming to the public domain in the last four or five years, because of which I wrote that article, they are not coming out with the proper methodology. They are not telling us exactly who are the people who did [the survey]. I think the main reason they are getting it wrong could be, one, is the sampling method. Because a face-to-face interview on a proper random sample is a very, very costly proposition in India with the kind of length and breadth of the country, and very interior areas. So I think one of the major areas where most of the polls are going wrong is on sampling.

Even if they get their samples correct and their field investigation correct, the second error could be the measurement error because getting the right vote share… [between] the time the vote share of a survey which you do and the election results are declared, the vote shares are within the limit of one or two percent. It doesn’t make much difference when you translate it into seats.

But the kind of results which we are getting, as I told you, five saying Aam Aadmi is going to win and five saying BJP is going to win in [the 2015] Delhi assembly elections. And it was a clean sweep. Just three seats for BJP. So there in fact it will be very, very hard to tell that because of these errors the surveys were wrong because with huge margins of the vote share, no survey can get it wrong. There comes in what I talk about, the vested interest of people in the media, their manipulating seats to create a bandwagon effect before the elections.

On the use of polls as tools of political communication

SS: In the article, you used pretty strong language, talking about how election surveys have been reduced to a “media gimmick.” If that wasn’t the situation in the past, why do you think that has happened and what can be done to change that?

PR: As far as the bandwagon effect is concerned, I think it was way back, around twelve or fourteen years back, when a series of surveys were done by CSDS and they used to come out with different report cards for MLAs [members of legislative assembly] and that was there in all the Hindi and English newspapers. A lot of people read it — we cannot say exactly what percent — but most of the people were aware of this survey. In fact, it came out daily for one month. So in the last round of surveys when they did all these MLA report cards, we had put a question, how many people have read about it? We were very surprised that just seven or eight percent of people had read about all those MLA report cards. And it was coming out in all the Hindi newspapers and English newspapers. But only six or seven percent were aware of it, the MLA report cards in the survey. And those who had read about it or heard about those polls, we asked them another question, what effect did those readings have on voting for a particular party? Did it change your original decision? They said no. Whatever decision they had taken was taken by us and the poll had no effect. The findings of our poll had no effect.

This is way back, I think, 2002. In the last eight to ten years (in fact, I am talking about pre-poll surveys just before the election), when you have a series of elections, say, OK, BJP is going to win, Aam Aadmi Party is going to win. Even Aam Aadmi Party, they also did a survey for the first [2013 Delhi assembly] election where they had spoken about getting something more than fifty seats. I think they got something like twenty-three or twenty-four seats. There also they had used survey findings done and Yogendra Yadav had joined the party at that time. They were saying that they were going to get more than fifty seats. And on the other hand, Congress was going into the elections. Because they were the incumbent party, they were saying they were going to get a majority. [The] BJP did their own polls and also said, “We are going to get a majority.” Now these are the three main parties and all three are doing that. But when the election results came out, the BJP had thirty-three seats, [AAP had] twenty-four seats, and Congress was completely wiped out [Editor’s note: In the 2013 Delhi legislative assembly elections, the BJP won 31 seats, AAP won 28 seats, and Congress won 8 seats, leading to an AAP-led government with the support of Congress].

So that means all the surveys that they did and all the findings which [were] put out, those were just to solicit votes. That was to create a bandwagon effect, that, “OK, now we are winning.” In fact, when all these reports were out, they were putting out banners and all those things. They were doing a lot of publicity.

SS: But based on that and what you just said, it seems like there isn’t a bandwagon effect then.

PR: No, this is something conjectural until the time we do a study to find out that people are aware of those findings and based on what they’re finding, “OK, Aam Aadmi Party is going to win and then all the traditional BJP supporters are voting for that.” So this is just conjectural because a study has to be done and definitely none of the polling agencies are going to do that. So I think the only one who can get into this and do a proper study to find out whether we have a bandwagon effect or not is the Election Commission of India. So that is what I have been telling them. Just ban seat predictions. Seat predictions. I’m not talking about banning surveys. Surveys are very, very important. Election studies are important.

The reason why I wrote in EPW is the fact that the purpose of doing election surveys from an institute… Political parties have also been doing that. But the purpose is now to find out what is the vote share they are going to get. Will this candidate do better? So they are using polls to find out how they are placed, how they can do that, what kind of candidate should do that. So the purpose was that.

So I think until that stage, until the stage where the Vajpayee government was there, before that, in fact, they had a minister called Pramod Mahajan, the first to come out openly and admit, “OK, we also do opinion polls but the purpose of doing our polls is to find out how we are placed in a particular contest, what kind of candidates we have to put in.” He was the first politician to admit that they do their opinion polls.

Until that stage, that’s not a problem. But doing an opinion poll, fudging data, putting out reports to say, “We are going to win,” that is not a purpose of opinion polls. That is not a purpose of election studies. And that is basically telling lies. The purpose of opinion polls in India was to find out the mood of the voters, like what kind of assessment we have of the incumbent government, what are the issues they think are important, and whether they have formed voting decisions until now or whether they are going to make voting decisions once the campaign ends. That was how opinion polls developed since the 1990s.

But I think over the last five years now, whatever they are doing, the most important thing is to just tell the people who is going to win. So now they are using it as a political communication. That is giving an extremely bad name to the polling agencies. And though we [CSDS] do a different kind of work — our purpose is to do election studies, do evidence-based reporting of how a party fared, do post-mortem analysis, and [make] data available for researchers who come in — at the moment everybody is clubbed in. Lokniti did seat projections for a couple of years and we dropped them because it was not scientific and it was not working well. But still people feel that, “OK, CSDS is also saying that.” So people have now started looking at opinion polls, they think it’s all biased and being done by interests to further that.

On why seat forecasting should be banned

SS: You make an argument that seat forecasting should be banned. You think the reason is not just because there’s a lack of transparency with how they come up with the seat forecasts, it’s [also] that the parties are using these seat forecasts to generate some kind of bandwagon effect that will help the party. And it’s not being done using scientific examples.

I spoke to Rajeeva Karandikar, who used to do the seat forecasts for CSDS. He’s pretty open about how he does the seat forecasts. He doesn’t give you all the specifics, but he kind of walks you through the general principles. He wrote a big article for The Hindu Centre [on this].

If more seat forecasts were done in a way that a very detailed methodology statement was included with the forecast, do you think that would be okay? Or do you think it will still be a form of political communication that political parties would use to manipulate public opinion?

PR: What has happened with opinion polls is that seat forecasting has taken the center stage. If we get a report out in a newspaper on an opinion poll on elections, the only figure that people want to see is who is winning, how many seats. As I told you, whether it has a bandwagon effect or not, most political parties think that it is a part of a communication campaign. So they have started using this for doing it.

I have [said] that the Election Commission has taken a very, very harsh view in the last five years. Nothing much has been done. A time had come when they were talking about banning election surveys, in fact, before the elections. Now they have a time period. Once the polling process starts until it ends, that is a time you can do your survey but you cannot come out with any reports. But just before the election process starts, you can do as many polls as you want to do, as many seat projections. Instead of banning election surveys, what I have argued is that seat prediction should be banned or else the other alternative can be what happens in Japan.

In Japan, seat projections have been banned. What had happened is that earlier they used to do seat predictions but when it went wrong, people went to the courts, the political parties went to the courts and filed a case, and people who did wrong forecasting were penalized heavily. The courts there said that if you get your predictions wrong, you have to pay for that. So seat projection doesn’t happen. At the best, they do a lot of election surveys and opinion polls, but they talk like, “BJP is leading and Aam Aadmi Party is second or third.” They talk like this, but they don’t assign any specific figures about, “This is the number of seats they’re going to win.”

So I think that would also be one way to deter media. In fact, if a political party does something and then does it through media, you know, an India Today poll… What happens is you’re masking your poll. Ultimately [the] media has to be made accountable, because television is a very, very important way of reaching the homes. And elections, politics in India, it creates more passion as compared with other countries. (laughs) People are political in that sense that they are interested. And when a media house puts out something it’s like putting a stamp on that survey. So either you put a huge penalty that if your prediction goes wrong, you cannot just say that BJP is going to win a two-thirds majority and when the results come out you find that the opponent side won [a] two-thirds majority. You can’t be so wrong. Either you impose a huge penalty on the polling agency and the media house which is showing it so that they stop doing all these gimmicks and all these bandwagon effects. Or the second and easier method could be that we just tell them which party’s vote share they can talk about, which party is leading, which is winning, but don’t give out those numbers.

SS: But don’t you think that the vote share would then become… It seems to me that even if you banned seat projections, the political parties still might try to gin up bandwagon effects by saying vote share: “According to this CSDS poll, we have 45 and these guys have 41.” I feel like vote share could also be a form of political communication, even if the seat share were banned.

PR: No, I agree with that. Once you ban the seat projection, people can come out with the vote share and they can talk about it, “This party is going to win the election and get a huge majority.” All those kinds of things will happen. So I think there has to be some body, the Election Commission of India has to set up one unit whose work would be to see what kind of media reports are coming out. And if somebody says that, “NDA is going to get a majority, and BJP will not get a majority but they will be the largest party,” but after the election you find the BJP gets a majority on their own. That does not amount to much tinkering of data or misuse of election surveys.

But you can’t do a projection that NDA is going to get a two-thirds majority, and when the results are out, you find that the opposite party has got that. So when they do that, I think the Election Commission has the right to get all the data and find out whether the survey or fieldwork has gone wrong, or the vote share they have computed has gone wrong but the survey is correct, or the vote shares are correct and they have manipulated the data, and based on that some punishment needs to be enforced if they want to restore the faith in election studies. Because it has taken a very, very bad beating since the last 2004 national elections, they are getting it wrong.

And even if you see this election [2014 Lok Sabha elections], because I did not do a follow-up article on this, here also in fact nobody spoke about a BJP majority. They all said NDA will get a majority. But nobody could in fact [say] that BJP is getting a majority. So that means the vote share which you got in your survey which you are showing, there’s some problem. And the problem can be of fieldwork, of data collection, wrong sampling. At the moment, we are just grappling in thin air. Whether the fieldwork is wrong or the fieldwork is right and people are manipulating at this stage, the media are manipulating. So I think the responsibility needs to be fixed, either with the polling agencies or the media houses.

SS: And do you think the responsibility for changing this, the culture around polling in India, that falls on the Election Commission of India?

PR: That’s the body which puts a stop to election studies because once elections are declared, most of the powers are with the Election Commission of India. The last one which we had, whatever orders they had were [carried] out. As I told you, the day the election notification takes place, from that day until the day the last vote is cast, that is a period they have said you cannot come out with a report of opinion poll. Even that is ambiguous in a sense. You can do your survey but you cannot come out with those reports. And I think that is being followed by most of the polling agencies and media houses.

So I think the Election Commission of India should get a study done to find out the bandwagon effect, so that it can tell the political parties, media houses, and polling agencies, “Your reports [don’t] make a difference.” Though I am just being conjectural. Maybe it makes an impact on the voters and some voters are changing because of the election surveys. And based on that the responsibility needs to be fixed so we have fairness and transparency in opinion polling to restore its image. Because at the moment it has taken a very, very bad beating.

SS: Have you heard about the Indian Polling Council initiative that’s supposed to be launched in the next month or so? There was an article in The Hindu about it.

PR: I am not aware of it.

SS: In the article, some pollsters — in the article, Yashwant Deshmukh and CVoter are the ones that are quoted — CVoter and five other polling agencies are thinking of coming together and starting their own group to come up with some standard guidelines for disclosure of methodology and transparency in terms of reporting on their polls. The idea is that if you belong to the Indian Polling Council, you have to report things like sample size, sampling methodology, etc.

PR: If this is happening, I think it is a very good initiative in a sense because what I feel is that instead of the Election Commission of India drawing guidelines for dos and don’ts for polling agencies, if they all come together and form their own thing where they share their things, and they can come out with reports, I think it would go a long way toward restoring the credibility of opinion polls in India. So it’s a good initiative and if it happens, it will definitely help.

On the challenges of polling in India

SS: In your own experience — you’ve worked on fieldwork, you’ve worked on questionnaire management, you’ve worked on all the stages of opinion polls — what are the biggest challenges that you found in accurately measuring public opinion in India?

PR: As far as challenges are concerned, there are a lot of challenges in the sense that this whole face-to-face interview is a very, very hard task. There are villages where you have to walk for kilometers on the hills to reach them. Lokniti does it, but the cost involved of doing a face-to-face interview is very, very huge in this country. That is one of the biggest challenges, and I feel for most polling agencies they don’t get enough funds to do this.

Secondly, doing a telephonic interview in India is still not possible because the mobile phones’ reach has increased but still most people don’t have the telephones. So making a shift from face-to-face, compared with US and other European countries where face-to-face interviews are very, very less and most people do telephonic interviews and now using different platforms [the] Internet. So I think this presents a huge challenge.

Apart from that, as you know we have a very, very diverse country. Different languages and all that. So if you do a survey at an all-India level, like Lokniti, we translate a lot of questionnaires into state languages. For most of the states, we are doing that. But for other polling agencies, we are not sure whether they are employing, like if you want to do a survey in Karnataka, you need to have a researcher from Karnataka and you need to have the survey in the local languages. That is one of the big challenges and this is also [a] huge cost. So whether these polling agencies are doing it or not, that I am not very sure about it. At least Lokniti does it, but still it’s a major challenge because even if you see a language, even in a state, there are so many dialects, that translating the questionnaire into dialects and getting the investigators with those dialects to do the interview, that is a huge challenge.

In fact, these are the most important challenges of doing an opinion poll in India.

SS: Anything else you’d like to add?

PR: I’ll add an example of the kind of challenges you face. As I told you, from 2005 until 2008, for three years Lokniti-CSDS, we did seat predictions. And that was mostly for assembly elections. So I think we had nine assembly elections during that time. We got seven right, the [election results] were exactly within the range. But we got one completely wrong and one partially wrong.

The election had taken place in Punjab. There also the fieldwork and everything went off well. We were using exit poll data. With the exit poll, what happens is that the moment the polling ends, say the polling ends at 5 o’clock, the television program has to go live exactly at 5 o’clock and by 5:30 you have to tell exactly which party is winning. So that means that you have to close your fieldwork by say, around 3 PM, because then you have to transmit data and all those things, and analysis has to be done, seat prediction has to be done. You get just two hours. So with Punjab we did that. It stopped at 3 o’clock and we got our data. We did the seat prediction and all.

But what happened is that the election was extended from 5 until 9 PM because there were heavy rainfalls that took place. Here all the predictions and television programs were on since 5 o’clock. We collected our data until 3 o’clock. And then the rain stopped at around 5 or 6.

Congress and the Akali Dal, they were in a very, very strong contest, just a margin of two or three percent. In fact, the Akalis were winning. So what happened is the rain stopped. A lot of Akali voters, they put their voters in their buses and they went and voted at 9 o’clock. Now for the exit poll we closed our field investigation at 3 o’clock. So our data was loaded in favor of the Congress. So we mentioned that Congress was going to win. But actually it was Akali Dal which won.

So we predicted something, it went wrong. We did a fair investigation to find out how come we got it wrong. Even for exit polls we do purposive sampling to make it representative. So we missed the whole, the Akali Dal voters go when the rain stopped at 4 o’clock. They went to the voting. And the voting was extended until 8 o’clock in the evening. That also, people didn’t inform us. So a huge section of people in a lot of places voted at that time. It’s a very strange thing. With all the Akali voters, what they do is that they get buses and tractors and all those kinds of things, where they go together and vote. So this is one challenge which nobody foresaw. And we got it wrong.

SS: What year was this?

PR: You have to check. This was Punjab elections. I think, Punjab elections 2007 or 2008. You will get that if you do a search [editor’s note: Punjab Vidhan Sabha elections were held on 13 February 2007].

SS: That was the one seat projection that you got way wrong, and there was another one which you said you got partially wrong.

PR: The other one where we couldn’t predict, I need to check. Was it Tamil Nadu? No, Tamil Nadu, we got it right. The other one I don’t remember. But two we got wrong. In the third, also, in fact with Tamil Nadu, what happened, as I told you, is that you have some error, the coverage error. There were some errors. But what happened is that the errors for non-response and the errors for non-coverage, both cancelled each other out. So we got the vote share correct. But if you see the survey, and if you see the initial data that was coming in, there was some problem in fieldwork. We couldn’t get the fieldwork done properly. It was a problem with investigators, with fieldwork, with supervision. There were logistical problems. We did not get the vote share. But one mistake, cancelled by the other mistake, and we got the vote share correct and we made the right prediction (laughs).

SS: And what was this for?

PR: This happened in Tamil Nadu, this was also between 2005 and 2009 [editor’s note: Tamil Nadu Vidhan Sabha elections were held on 8 May 2006].

SS: For Vidhan Sabha?

PR: Yeah, for Vidhan Sabha. Our prediction was perfect. But if we see our data, and what had come, and what were the errors, as we do a review of all our surveys once we do [them], and mostly during seat projections.

So two wrong, and third one we got it right by default. So we decided that doing seat predictions and vote shares, it’s a very, very tough situation in India. And so we stopped. I think 2008 we stopped. 2009 Lok Sabha elections, we did not make any prediction.

And then there are some interesting cases. In one of the Karnataka assembly elections — not the last one, the one before that — I very well remember that Congress had got 32% of the vote and [the] BJP got 30% of [the] vote, but [the] BJP got the majority of seats and they formed the government [editor’s note: In Karnataka’s Vidhan Sabha elections held over two phases in April 2004, Congress won 35% of the vote share and 65 seats, while the BJP won 28% of vote share and 79 seats].

SS: Because their vote share were spread out in a way that was more advantageous.

PR: Yeah. The Congress vote share was completely spread out and they [the BJP’s votes] were concentrated in different assembly constituencies.

I think that was also one phenomenon that we first saw in Karnataka, because before that we hadn’t seen it in any elections, having more votes but getting fewer seats. Karnataka gave us that lesson.
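As a rough illustration of the point above, and not CSDS’s method or real Karnataka data, a minimal Python sketch with made-up constituency totals shows how a party can win more votes overall yet fewer seats under first-past-the-post. The party labels "A" and "B", the vote counts, and the helper name `tally` are all hypothetical.

```python
from collections import Counter

# Hypothetical vote counts per constituency (invented for illustration).
# Party A piles up big margins in two seats; party B wins three narrowly.
constituencies = [
    {"A": 70_000, "B": 30_000},
    {"A": 65_000, "B": 35_000},
    {"A": 48_000, "B": 52_000},
    {"A": 47_000, "B": 53_000},
    {"A": 49_000, "B": 51_000},
]


def tally(results):
    """Return (vote share in percent, seats won) for each party."""
    votes = Counter()
    seats = Counter()
    for seat in results:
        votes.update(seat)                      # add this seat's votes to the totals
        seats[max(seat, key=seat.get)] += 1     # first-past-the-post winner takes the seat
    total = sum(votes.values())
    shares = {party: round(100 * v / total, 1) for party, v in votes.items()}
    return shares, dict(seats)


vote_shares, seat_counts = tally(constituencies)
print(vote_shares)   # share of all votes cast, by party
print(seat_counts)   # seats won, by party
```

In this toy example party A takes the larger share of the vote but party B takes the majority of seats, which is why an accurate vote-share estimate does not by itself yield an accurate seat projection.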

And apart from that, if you see the case of Bihar, so [in] Bihar there were completely new political coalition formations. If you see Bihar, first of all, there were so many parties, new coalitions. When you have new coalition partners, you don’t have any record of the previous vote shares. You have nothing. And how that coalition works on the ground, the vote transfers and all that. That is very, very difficult to ascertain through an opinion poll. Until the time that you don’t have some [qualitative] study done to substantiate your quantitative thing, you can never get a seat prediction.

Where there is just two main parties in contest like it happens in Tamil Nadu, a couple of states, it happens in Rajasthan where [the] BJP and Congress is there, it happens in Uttarakhand, there once you get a vote share [which is] correct, like 45% for Congress and 38% for [the] BJP, you can easily do a projection based on that. And mostly you get it right.

SS: But it’s more difficult when there’s more parties, like Bihar or UP [Uttar Pradesh].

PR: These are the challenges, and how the model is going to take into account these things, for that, we say that until the time polling agencies don’t come on a single platform, they don’t share their prediction models, and if they’re not going to improve upon that, they are going to be more and more wrong.

SS: And for the Karnataka poll where Congress got a larger vote share, but BJP got a larger seat share, did CSDS do a seat projection for that election?

PR: I think we did that and maybe that was one survey we got wrong.

SS: You had the vote share right but the seat projection didn’t predict that [the] BJP would end up winning more seats with less votes. OK, I’ll check that.

PR: That you can check. I think we have it on our record though I don’t have it because we did a complete check of our seat predictions.

It started in 2005, exactly with the Bihar elections. We got it right.

SS: The seat projections.

PR: The seat projections. For all the political formations, we got it correct. We got the vote shares correct. That was the first seat prediction we had done that was really celebrated for getting it right. And then it went on for a couple of years. It was 2008 or something, I think, when we got two or three wrong.

SS: This was the year of Punjab and Karnataka.

PR: Yeah. Punjab and Karnataka.

SS: Wouldn’t you say, if you got seven out of nine right, that’s a pretty good record, right? Maybe you think statistically because of some of the errors, because some of the challenges of polling, you’re going to get a few wrong, but the overall track record of seven out of nine, or whatever it is, that seems pretty good. I guess it’s a matter of focus on the fact that we got seven out of nine right, or we got two out of nine wrong.

PR: No, but see, if you see the pre-poll surveys which we have done — at the moment we are just doing post-poll surveys — you can see the data that we have got in all the pre-poll surveys. You will not find a single poll where you find the actual vote share in that election — if Congress got 40%, then we have got 30% — it was never so wide. Our vote share would come to around 38%, or maybe 1% more. It has always been in the range of 2%. Plus or minus 2%. In fact, the vote share has never swung more than that.

SS: Is this for post[-polls] or pre[-polls]?

PR: Pre-polls. Just before the elections. In fact, we have got it within the range of 1-2%. So you can just imagine that we have got the vote share 95% of times right. So that means that the sampling method which we are doing, the field investigation we are doing, our systems are perfect in the sense that we get it right.

But even with those right vote shares, when you do a seat prediction, as I told you, we got six or seven right out of nine. One or two we got it right by default. By default, one error cancelling the other error, as I told you. That means that that system is still not completely developed. Either you do a lot of research and development, when you can put in all these variables, like concentration of votes and all these kinds of things and different vagaries [of] different coalitions. Either statistically over a period of time, and if somebody does that, but at the moment I don’t see anyone.

Because even if you see a new polling agency which just came up, its name is Chanakya, Chanakya got it right in 2014. You will not believe. They got the seats right for BJP, okay, and all the polling agencies missed it. Everybody missed it. And I think everybody laughed also again. BJP will get a majority. Everybody was laughing. “Who is this Chanakya?” and all that. And then Chanakya got such a big name. And then again [they] have started floundering. Delhi, they have got it completely– Bihar, they have got it completely opposite. Now people are laughing. They are saying that Chanakya by mistake, instead of putting UPA, they are putting NDA with some grammatical mistakes [editor’s note: Today’s Chanakya exit polls showed an NDA landslide. Following the announcement of election results, they said their prediction was off because a computer error had mixed up the names of the alliances]. (laughs)

SS: Yeah. I saw that. Interesting excuse, yeah.

PR: Prannoy Roy is one of the leading psephologists in this country. He was the one who started doing opinion polls through his channel, and all the swing votes, and all that, he made it so popular. So I think Yogendra Yadav and Prannoy Roy, not only are they the best political analysts but they are the leading psephologists. And the way Prannoy Roy got it wrong this time, he had to render an apology. So now this really calls for a question. In fact, people should ask. The media should go and ask him, “What kind of a survey [did] you [do]? Your survey was wrong.”

Because there are a lot of stories going around, and they were trying to show this, push this, that should [the] BJP comes in power, they have all these taxes left and these debts will be paid. They are getting pressures from above.

SS: It sounds like CSDS has a pretty solid track record. You’ve gotten a few seat projections wrong. And maybe that’s because of the methodology, because of the focus on post-polls rather than pre-polls. It seems like when you do pre-polls, you try to do them as close as possible to the election as you can.

PR: The whole purpose of doing an opinion poll for us is it’s a study. In fact, we call it election studies. Our purpose is that. The reason why we started going to the media is that there is a huge cost involved in doing this. Until the 80s, the government agencies provided funds, but after [the] 80s, there were no funds available for conducting election studies. It is still not a discipline in India. It is still a sub-discipline.

So when we had the tie-ups with the media, that helped us in doing our surveys and giving them the required data like they are just interested in the vote share, the popularity ratings of the leaders, those kinds of questions. But apart from that, using those funds, we have been doing a huge, a very, very in-depth study of that election. So the data is with us. So our purpose was that. And when we got into seat prediction, at that time [the] media said, “Since you all are experts in election studies, why don’t you get into this?” And we also went into that to see how can we refine, and the model which Rajeeva Karandikar, I think, is still using is the one that was developed at that time. Then we started doing the seat predictions. But once they started going wrong, we said we don’t want to do it. Because our purpose is not that. Our purpose is to have a set of data on each and every election that happens in the country.

SS: And to understand why the results are happening and not necessarily just predict them beforehand.

PR: Yes.

SS: Thank you very much.

Op-ed in The Hindu

Slowly returning to the blogosphere after a busy week and a half. I have an op-ed in The Hindu today on what pollsters and political analysts can take away from the Bihar polls. Today’s Hindu op-ed page also features a thoughtful piece on the long-term meaning of the Bihar elections by Yogendra Yadav as well as an examination of CVoter’s Bihar post-poll data by Yashwant Deshmukh and Manu Sharma.


Yashwant Deshmukh on the challenges of polling in India and the forthcoming Indian Polling Council

Yashwant Deshmukh is the Founder Editor of CVoter, and a Communications Professional with the working experience of Journalist, Pollster, Evaluation Expert, International Observer and TV News Anchor rolled into one. He founded Team CVoter in 1993, when he was still studying in IIMC. After receiving the UNI award for best research dissertation and for topping the 1993 batch across all streams, his company CVoter was hired by the premier news agency UNI to take care of on-line real time election analysis. Team CVoter continued to grow, and is now one of the largest media and stakeholder research agencies in Asia with expertise in Public Opinion Research & Election Studies. Today more than 120 team members work for CVoter across their 24×7 offices in Washington DC, Dubai and New Delhi.

CVoter is also one of my research affiliates for this project. I have been regularly visiting their Delhi office (which is technically in Noida) to learn about how they do their research. I visited Patna to observe their exit polls during the third phase of the Bihar elections.

Now based in Dubai, Yashwant was in Delhi last week for the Bihar election results. Last Friday, I met with him at Janpath Hotel near Connaught Place to talk about his experience doing public opinion research in India, the challenges he has encountered, and a forthcoming association for Indian pollsters called the Indian Polling Council. The transcript of the conversation below has been lightly edited for length and clarity.

On polling in India

Sam Solomon: Please tell me about your background conducting public opinion research in India. How did you get into this field?

Yashwant Deshmukh: I am a major in journalism, actually. I did my post-graduation in journalism from IIMC. I am a journalist. That is my primary identity and my training.

But yes, I have always been interested in elections, ever since I was a kid. Had a personal history of roughing up with [the] Emergency. I wrote a piece on Huffington Post about my experiences during the Emergency. You might like to pull it out and read it once so that you can get the context of how as a kid I observed the Emergency and I got into the ‘77 elections which was a watershed election in Indian electoral history. Elections were always like something that were good for the people, good for the country, something which und[id] the wrong and [did] the right thing. So that’s the kind of personal background that I was raised on.

And then when I became a journalist, I was — I did my research dissertation on elections only, in IIMC — and then I wanted to join some polling organization as a political analyst. So I didn’t get a job there. I got a job in the campus placement in the newspapers and the magazines. But I didn’t get a job with the polling companies. So when I didn’t get a job with the polling companies, I had to start on my own. (laughs)

SS: What polling companies were around at that time?

YD: At that point in time, there were no multinationals around.

The biggest company at that point in time was ORG and Marg (they later got merged as ORG-Marg, in later years). But yes, they were the people who did the fieldwork for the initial polls of Prannoy Roy, which were the path-breaking polls in India, in ‘80 and ‘84 elections, to be precise. ‘84 was the time when psephology got a wider acceptance of political polling, in a way, when Prannoy said that Congress was going to end up with more than 400 seats. He got it right. So they were the people who did the fieldwork. Prannoy and Ashok Lahiri, they analyzed those elections. That was their job. They designed the poll and they analyzed the poll. Prior to that, in India the polling was only limited to what Eric da Costa was doing in IIPO [Indian Institute of Public Opinion], and there was no history to political polling as such.

I think that when I was graduating in the early 90s, that was a very critical and interesting time in Indian political life otherwise. It was the peak of what we call the Ayodhya controversy or movement. The BJP was emerging. The Congress was having problems. In [the] ‘91 elections, unfortunately, Rajiv Gandhi got assassinated. It was a very transitional time when PV Narasimha Rao became the Prime Minister. It was the time when [the] economy was opened up. So altogether [the] early 90s were politically speaking very charged, after [the] ‘89 elections, you know. V.P. Singh, Mandal Commission, Ayodhya, opening up the economy. I think there was too much to consume. It was a thorough transition and transformation that was going on.

And it was a time when I was graduating as well and being a journalist, it was interesting for me to see and map the public opinion on different things. So that’s where I started. I floated CVoter when I was studying at IIMC in 1993. I got lucky because I topped the ‘93 batch at IIMC, and I was awarded a UNI award for my research presentation on elections. There are two primary news agencies in India, PTI and UNI. And UNI the next year actually offered me to analyze the elections for them as the live counting was done. So that was our first major break. Probably my research dissertation and getting an award on that luckily helped me to get into our first assignment.

And after that, because of UNI, the [feed] was going to more than seven hundred newspapers across the world. Slowly and slowly, we started working for other media houses. The Week was the first one to give us a commission on an opinion poll.

SS: When you say “we,” you’re talking about CVoter, right?

YD: Yeah. So that’s where it started. And one after the other, it kept on happening and it was happening at a very frantic pace because we had ‘96 parliamentary elections, then ‘98 parliamentary elections, then ‘99. So within the gap of three years we had three parliamentary elections in India.

And it was also the same time when satellite TV opened up in India. The first 24/7 news channel opened up in India; that was Zee News. Prior to that, the news operations were limited only to another satellite network called Jain News. Incidentally, Professor Yogendra Yadav also did his first election on TV with Jain News. That was the only private satellite network which was doing news in India at that point in time. We are talking about 1996. Then, Zee was the first 24/7 news network. Zee hired me to do the polls. After that, Aaj Tak came in, then Star, and so on.

Basically, being a journalist, being a broadcast analyst, being a TV guy and a media guy, it kind of helped me to be on the screen because I could not just analyze but also interpret the data. It kind of helped me stabilize CVoter in terms of the operations and other things.

This had been going on, then after that we went beyond political research to socioeconomic research. We entered into disaster mitigation research. Conflict resolution research. That’s where the majority of our work right now is going on. Into the international work as well.

I was always grateful to Robert Worcester, the founder of MORI, because he allowed me way back in the ‘90s to observe the MORI exit polls in the UK. So I learned that part from there and I applied that as much as possible in India. If I have to pick my godfather– I guess Bob has been [that] to many other pollsters across the globe. I’m happy that I learned from him.

It was also the time when the multinationals started coming into India. AC-Nielsen came in. TNS came in. Ipsos came in. Gallup came in. They were following their clients in India, because the economy was opening up. And it was also a transitional time when the bigger Indian companies were being taken over by the major groups. So all of a sudden we realized that now we are the only Indian company working as such in the private sector with that kind of footprint and operation, volume and scale.

We entered the US five years back. We did the last presidential [election] in the US. Rated very well in our very first operation over there. We did South African elections, bigtime. We did the backward integration for the UK elections as well but we did not take the CVoter brand over there. We were working for some other group over there. Hopefully in the next election, we will enter the UK market as well.

SS: During all this time that you’ve been working in polling, how has the polling industry in India changed?

YD: Difficult question. So much has changed and still it feels like nothing has changed.

What has changed is the technology of it. Technology has changed a lot.

We have moved the majority of our operation from face-to-face to CATI [computer-assisted telephone interviewing] now because of the simple fact that India is right now the biggest mobile market in the world. Even though 85% mobile penetration is there now. Actually, 85% is of the total population. If you talk about the adult population, it is at saturation, more than 100%. Statistically speaking. We know that about 20% of Indians don’t really have [a] mobile phone. But yes. It’s huge. And because it’s calling party pays, we started CATI operations seven years back. And now it’s getting standardized. I believe that’s the future. That’s why we invested in that.

Online research in India is still far out because only 15% of Indians technically speaking have the Internet facility, but only 5% of them are actually [using the Internet] for social media or other purposes like that. It is just that the volumes are too big. It’s not representative as such. It will still take a lot of time to come. CATI is getting there. So technologically speaking, there ha[s] been a lot of improvement.

From the quality [perspective], unfortunately I actually see a downswing because this place was earlier being observed by a handful of key players. They were open to skepticism and criticism. But all of a sudden with the new electronic media and everything, we see many players who are opaque and the methodologies are not being discussed [in the] way they should have been discussed.

The polling industry as such is getting to a stage where less interaction and less education of the media and the media consumers at large is proving to be detrimental for the health of the research industry as such. The concept of polling has arrived. Many people work very hard for that: Eric da Costa in the initial years, Prannoy [Roy], Dorab [Sopariwala]. They all worked really hard. Professor Yogendra Yadav. He did his part very well. But it’s trivialized now in the media circuit. The media and journalists at large don’t understand. They don’t understand what to expect from the polls, what are the limitations of the polls. And that is why their expectations from the polls [are] wrong. And that is where it gets trivialized. And when it gets trivialized at the editorial level, it gets automatically trivialized at the readership level. So that is a big challenge that needs to be [addressed].

But technologically speaking, a lot of improvement has happened. And from that perspective, it is getting better, I would say. It is getting better with time. It is getting better with each and every assignment.

But the expectation from the research industry is wrong. The limitations, the plus points, they need to be more discussed and understood more properly.

SS: You said that it feels like everything has changed and yet not much has changed. What do you feel hasn’t changed about the polling industry since you got into it?

YD: What hasn’t changed is the pathetic understanding of the polls by the media. What hasn’t changed is the absolute lack of data awareness among the journalists in this country. What hasn’t changed is the knack of sensationalism in the media scene while they are reporting the polls.

What hasn’t changed is the lack of continuity of the data gathering and analytics, the serious component of the research of the changing trends on political sectors and indicators. What hasn’t changed is [the] absolute absence of philanthropic funds which should go into the research industry. In the West, right from the start and even today, almost 90% of the socioeconomic research is being funded by philanthropic funds. In India, it is a big zero. So the only funding that the polling actually gets in India is either government or the media. And they have their own limitations on doing trackers and serious socioeconomic research. So that hasn’t changed. There was no funding from [philanthropic funds] back then, there is no funding in it today.

So what hasn’t changed is the lack of serious polling on socioeconomic issues. What hasn’t changed is the lack of continuous polling on serious socioeconomic issues. These are the things which have not changed.

SS: When you talk about the lack of quality coverage and awareness of how to analyze data at the media level, could you talk about this specifically with regard to Bihar? We were talking earlier about the headline today for the exit polls, how it just shows the midpoint, it doesn’t show the ranges [of seat projections]. How do you think the media coverage for Bihar has been as compared with other elections?

YD: Basically, it underline[s] all the things which I just mentioned. If the understanding of the polls and the understanding of the research would have been better, it would have been easier for the media to say that these elections are close to call. If something is close to call, why [does] it ha[ve] to be a matter of ego that one needs to take a call? When you take a call that it is close to call, that itself is a call that it is close to call.

The big problem in these Bihar elections has been as usual the media’s inability to accept that this is close to call. And them forcing the pollsters to take a call. Because that’s what we are getting paid for. All the pollsters are getting paid to get a call. But I think the change is that now we are also forcing in the ranges which are overlapping. We are forcing them to say personally, even if it has to be on a personal level like me standing in front of the camera, saying, “Listen. This is close.” Statistically speaking, it’s difficult to call. My ego is not getting bruised or hurt by saying that I am unable to call this. No big deal.
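To make the idea of overlapping ranges concrete, here is a minimal Python sketch, assuming each alliance’s seat projection is reported as a (low, high) range. The function name `too_close_to_call` and all of the numbers are hypothetical; this is not CVoter’s model, only one simple way to express the rule that overlapping ranges should not be reported as a clear call.

```python
def too_close_to_call(range_a, range_b):
    """Return True when two projected seat ranges overlap.

    Each range is a (low, high) tuple of seats. If the ranges overlap,
    either alliance could plausibly finish ahead, so reporting only the
    midpoints would overstate how certain the projection is.
    """
    low_a, high_a = range_a
    low_b, high_b = range_b
    return low_a <= high_b and low_b <= high_a


# Hypothetical projections for two alliances (invented for illustration).
close_race = too_close_to_call((112, 132), (108, 128))
clear_race = too_close_to_call((150, 170), (60, 80))
print(close_race, clear_race)
```

A headline built on midpoints alone (122 versus 118 in the first hypothetical case) would suggest a leader even though the ranges say the contest is open.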

In fact, more than Bihar, I would like you to read the one-year report card of the Modi government programme. It was a brilliant programme on India TV and Times Now where it was a Mood of the Nation poll. Probably the first time where I requested [of] them, “Please don’t force me to give you the seat share and vote share.” Because that’s not really the thing you should be analyzing in a weekly report card.

You are aware that we do the weekly tracker, which is the only such vehicle in India. Nobody does that. It was nice to see week-on-week Modi’s popularity going up or going down and analyzing that. It was fascinating to see that right after that 9 lakh suit controversy Modi’s popularity went down. As they say, pollsters are the chroniclers of history. It was important to see the last fifty-two weeks on different scales of popularity, the satisfaction, the issues, which way the country is going. If I would have given the vote and seat share projection, then nobody would have talked about those issues. “Ah! If the elections are held today, Modi would have faced problems or Modi would be flipping more [seats]…” That’s trivializing.

So the big change is that I requested them and somehow I got lucky. Both my editors, Mr. Sharma and Arnab, they actually saw the merit in that thing. And they did a brilliant program which was [a] first in Indian television, that you are having a political program on a one-year report card without showing the vote and seat projection. You never have any state of the nation, mood of the nation [program] without vote and seat [projections] in this country. So that is the change which I am trying to push. I am lucky that I got a couple of good friends in media who are trying to see the merit in it. I hope that it continues.

But somewhere we have to put the foot down, that this is what the limitation is. 90% of the research material that goes into serious socioeconomic research and results go unreported in the media. The only 10% of the political aspect of it gets 100% out-of-proportion coverage. That needs to be changed.

And that is what we need to learn from the West. There are polls year round happening day in and day out on thousands of issues which are mapping the health of the society, the socioeconomic patterns, the issues, the trends, which way the society is heading. Not which way the politics is heading. We need to change that.

To change that, a lot of such surveys [need] to be done on a regular and continuous basis. To do that, a lot of funding is required. Not just from the media, but also from the corporate sector. To map those sentiments, to map the socioeconomic trends of the society, that needs to creep in. Somewhere it has to creep in.

On the challenges of polling in India

SS: What are some of the challenges that you’ve encountered during your career in terms of accurately measuring public opinion in India?

YD: That has to do a lot with two components. One is the training of the researchers. Two is the response rate in different sections of the societies.

Training of the researchers, that’s something which is directly related with the first part of my answer. In the absence of regular polls, even if you train a few researchers here and there, you are likely to lose them. Because if I am going back to Bihar after five years, I am unlikely to get back my trained researchers which I trained five years back. And I am unlikely to get the researcher which I just trained now five years from now unless I have a continuous assignment in Bihar and keep on hiring those people and keep on honing their expertise of interviewing and constant improvement. The researchers [are] a big issue and that [is] primarily because of the lack of regular research in the field.

The second part would be the response rate of the different categories. In India, you know it’s a very heterogeneous society. Largely speaking, all of us have been getting lesser response [rates] from the minorities, from the Dalit communities, from the females. That’s a historic thing.

That is something which cannot change overnight. We tried to come across these things by say, for example, for minority respondents, trying to field same-faith researchers into those localities to conduct the interviews. That helps [in] getting the response rate better. Fielding more female researchers to talk to more females. That helps. But then, fielding the female researchers increases the cost of the operations. It’s a tricky thing to understand that–

SS: How does it increase the cost?

YD: It increases because of the local sensibilities. Say, for example, if I have to conduct research 100 miles from here, or 200 miles from here, or 500 miles from here into some interior village, and I have to send a team of four researchers over there, ideally I should be sending two females and two males who can move together. If the selected respondent is a female, let the female interviewer take the interview. If the selected respondent is a male, let the male interviewer [take the interview]. Because in the rural areas, it is absolutely improbable for an unknown male to conduct an interview of the female of the household. It’s not part of the culture. It’s not done. Anybody who says that it is done does not understand India or South Asia, to be precise. It’s difficult. It gets easier to get the female response if the female interviewers are interviewing them.

But then if you have to send two researchers in the field, the cost goes up. If you are staying there overnight, if there are two males, you can still pick up one room for them to share. But if there is one male and one female, you have to pick up two rooms. If there is no boarding facility in that area, then you have to ensure that they come back to the district headquarters or their main place before it gets too dark. So your fielding operation times get lower because you fear for the security of your female colleagues. And for obvious reasons.

All in all, this is a practical problem for all the polling companies. Barring the metros, barring the areas which are urban, barring the areas where they can hire female researchers who can conduct the interviews and come back on their own safe and sound, it is unlikely to send female researchers to the unknown interior areas without knowing the safety and security [is] in place. That’s the cultural context of it.

And when the clients are not willing to pay extra for covering up those additional costs, then the chances are the [researchers] are, to work within the given budget, more likely to send a male-only team into the field. And those male-only teams are likely to return with lesser female sample because of the non-response.

So it’s a vicious circle. One thing adds to the other. But you can always weight [after the fact]. But in the field operations, these are the things which one has to deal with.
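[To illustrate the after-the-fact weighting mentioned above, here is a minimal sketch of post-stratification by gender. It is not CVoter's procedure. The function names, the 70/30 sample split, the 52/48 population split, and the support figures are all hypothetical values chosen for the example.]

```python
from collections import Counter

def poststratification_weights(sample_groups, population_shares):
    """Weight for each group = population share / share in the achieved sample."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return {g: population_shares[g] / (counts[g] / n) for g in counts}

def weighted_share(responses, groups, weights, answer):
    """Weighted proportion of respondents who gave `answer`."""
    total = sum(weights[g] for g in groups)
    hits = sum(weights[g] for r, g in zip(responses, groups) if r == answer)
    return hits / total

# Hypothetical male-heavy sample: 70 men, 30 women.
groups = ["M"] * 70 + ["F"] * 30
# Hypothetical answers: 42 of 70 men and 12 of 30 women back party "A".
responses = ["A"] * 42 + ["B"] * 28 + ["A"] * 12 + ["B"] * 18

# Assumed population split, for illustration only.
weights = poststratification_weights(groups, {"M": 0.52, "F": 0.48})

unweighted = responses.count("A") / len(responses)
weighted = weighted_share(responses, groups, weights, "A")
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

[Weighting rescales the women who did respond so that they count for their share of the population. It cannot recover the views of the women who were never interviewed, which is why the fieldwork problems described above still matter.]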

SS: What are some of the challenges you’ve faced in terms of effectively communicating poll results to a client, to a media outlet, or to the public?

YD: To the media outlets, as I said, they always need a number, especially the seat share in an election poll or a political poll. Not everybody understands very kindly that the science of surveys stops at the vote share calculation and conversion of votes into seats is not really part of surveys.

Everybody has got their own algorithm or mathematical formula to do that conversion. But that’s not foolproof. That’s not proven. That’s a work in progress always. Even in the oldest of the First World democracies. What we have seen in the UK elections. Even they could not convert it properly. They could not sense that the Tories would be getting a complete majority in the UK. And they have been doing the polls for — I don’t know — seventy or eighty years now.

The media doesn’t understand. We have been saying this. Everybody. Yogendra has been saying this, I have been saying this, everybody who has been appearing on TV has been saying this for so long. But still, the fascination of media with the seat share is amazing. They just refuse to understand. And the most classical [example] is that some of the media guys are so ignorant that when you talk about the margin of error, they put the margin of error on the seat share calculation. So that is the level of understanding the data. How many media companies have a research editor? Barring one or two, I don’t know if the newspapers or TV stations have research editors in them who understand data. They don’t have [them]. That’s a problem. It’s very difficult.

Yes, classically speaking, the normal viewers, the people actually understand it better. It’s very funny. When I interact within the social media with the people, when I tweet, they understand it better than most of the journalists around. So it’s difficult to make them understand.

Somewhere probably we have to draw the line, that okay, you see we can at worst try to come up with the vote share calculation. This is the seat share calculation. But we can’t hone it. It’s not scientifically proven. It’s only an assertion. It’s only a calculation. It’s only our best idea. But it is not the thing. It’s not the thing.
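[As a rough illustration of why the margin of error belongs to the vote share and not to the seat count: the textbook formula below applies to an estimated proportion under simple random sampling. This is a minimal sketch, and it ignores design effects and weighting. The function name, the 40% share, and the sample size of 1,000 are hypothetical.]

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated
    from a simple random sample of size n (z=1.96 for 95% confidence)."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical poll: a party at 40% vote share among 1,000 respondents.
share = 0.40
moe = margin_of_error(share, 1000)
print(f"vote share: {share:.1%} +/- {moe:.1%}")
```

[The formula takes a proportion and a sample size as inputs. A seat projection is the output of a separate conversion model, so this formula gives no error band for seats.]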

SS: CVoter is an international firm. You mentioned earlier that you’ve done polling in the United States, South Africa, and the United Kingdom. In a previous conversation that you and I had, you said that nowhere is more difficult to survey than here in India. Why is that?

YD: The sheer heterogeneity of this country. If you look into our operation that you have seen already, look into the kind of languages that we cover, the kind of demographics that we cover, you will understand that polling in India week after week is kind of [like] running a Eurobarometer week after week. It’s as simple as that. If you can understand the difficulties in running the Eurobarometer, you can understand the difficulty of running the Indian operations as well.

And then the sheer length and width of this country. The geography. The topography. How difficult it is to reach the interiors. That is why we took a conscious decision to go to CATI on the mobile, because that is something that shows us that we are reaching the remotest of the places in random probability sample. And we are speaking twelve different languages. And we are doing [the] audio recording of each and every interview. And doing the quality auditing of each and every interview. That kind of quality is possible only in CATI. That is why we took a conscious decision that [the] future is that. We cannot leave the field operations to happen in that way, doing only ten or twenty percent of back-checks in the face-to-face surveys. Everybody understands the handicap of doing that. Everybody understands where and why it can go wrong.

For example, in the West, the idea of human labor to be calculated in the per-hour approach. It’s a very objective thing. That you work for eight hours and you will be paid on a per-hour [basis], or how many samples you cover. So the per-sample based approach to the researchers, it is so West[ern].

In India, it doesn’t work like that. Or this part of the world, or even in Africa, it doesn’t work for a simple reason. I send you to a certain village. Thou shall go there and thou shall interview people and come back, and per-interview I am going to pay you this much. Now, you in the best possible way, picking up the autorickshaw, going to the bus station, taking the bus to that village, reaching that village, figuring out the random probability sample, and getting to that person, and the person refuses. And then you have to also worry about, “OK, at four o’clock is the last bus. I have to come back.” So what is happening is that you have spent an entire day doing the best possible work and still you might end up [with] two or three interviews. So if it is not economically viable for the researcher, what do you expect the researchers to do? To give you a complete random probability sample? No. Hell no. They are likely to do a cluster sample and come back and report to you as a random probability sample. Because they are getting paid on a per-sample basis.

This is why about twelve years back, we changed — in our organization, in our system — the per-sample system is wrong. Because the researcher is going there, doing the job. He or she may not get a single interview the entire day. And that’s not his or her fault. So the payment has to be on the complete day basis. That’s where we changed it in our organization. So today in CVoter, it’s been almost for the last twelve odd years, we tell the researchers, “You are being sent to this place. Go and do it. At the end of the day, if you do one interview or zero interviews or ten, you are going to get this much of payment which you deserve [for] doing the hard work. Plus, please make sure, even if you are doing one interview, that one interview should be scientifically, properly done and reported.”

But then this methodology or this way of functioning increases the cost of the operation. And when the quality goes up, the cost goes up. And if the clients are unwilling to pay for that cost, that is where the quality degradation happens. Now, understanding this thing is something which is important. The client’s understanding of this is more important for them to know that this is difficult, this is how it has to be done. So all these things, they need to be taken care of. New, better methodologies and technologies are to be adopted to improvise this further. And the clients need to understand this as well: what are the limitations of it, how it is to be done, and how the best could be achieved from this.

SS: What are some of the tradeoffs that you have to consider when you’re deciding whether to field a survey in person, over the phone, or over the Internet?

YD: Well, Internet is a no-go. First of all, it’s not really representative. For a simple reason that only 5% of the people are actually active on the Internet. The 15% that is floating around is for the people who have the data pack or the data plan, but not everybody who has the data plan is active on the Internet. So first of all, it’s a skewed number. It’s not representative.

The second thing is that even in the West the online opt-in is scientifically not a representative sample. That’s a fact. The so-called river sample is also again not a random probability sample. I mean, it’s a random probability sample among those who are visiting that part and willing to comment on it. So even a river sample is not exactly a simple random probability sample. In India, you cannot even think of doing that simply because number one, it’s not representative, and number two, the incentives with opt-in panels are further going to degenerate in that direction. So as of the current moment, the online thing is not done.

We do a random online thing, when we are doing our CATI every week, we also keep on asking in the final question, “Do you use Internet?” and “Would you like to be part of our panel?” So that is something which is [a] much more randomly recruited panel. But even in that random[ly] recruited panel when we have tried to do the surveys, one reminder, two reminder, three reminders, people are not very keen to click and answer. And probably the third round when we call them from the CATI center, the more likely answer is, “Why are you pushing me to click and answer? Why don’t you simply ask me? I will answer you right now.” So it becomes more like — instead of online — it becomes WAPI, web-assisted personal interview. That means you are talking to them and you are punching the data on their behalf. The only good thing is that some clients understand that and they agree to it whenever they commission it, because we have the audio recordings as the proof of the interview. But still a long way to go on the Internet.

CATI is the future right now because Indians may not read and write but they can certainly talk. This is not my saying. This was something which was said by Dhirubhai Ambani when he was launching his mobile services in India. [In] India, a big number of people are not very educated. A big number are illiterates, to be precise. But they will certainly talk on the phone. Everybody has a mobile these days. So that helps. So CATI is the future, as far as I’m concerned.

SS: Have you found that certain populations — because of cultural factors, because of socioeconomic factors — are more or less likely to be selected and participate in a survey? You discussed this a little earlier.

YD: Yes. The more educated, the more well-to-do, urban, affluent, upper caste, males are more likely to respond. The less educated, females, minorities, lower castes–

SS: Religious minorities?

YD: When I say minorities, I am talking about Muslims especially.

SS: What about Sikhs or–

YD: Sikhs, I never had [problems]. Because Sikhs are, from the class distribution perspective, Sikhs are one of the wealthiest communities in India. Jains are one of the wealthiest communities in India.

So more educated, well off, urban. The response rate of Sikhs is fabulous. [The response rate of] Jains is fabulous. The response rate of Christians is perfectly fine. The response rate of Christians even in the tribals is perfectly fine because in all likelihood they get better education because of the Christian missionaries running the schools. They get better education, so their response rate is better in that way.

But the females at large — regardless of caste, creed, community, or whatever it is, religion — females at large the response rate is lower. The Muslims, the response rate is lower. Unless you are sending the same-faith research into their community to ask the questions.

In certain areas, the Dalit response rate is lower. When I say certain areas, it is directly proportional to how politically empowered the Dalits are in those areas. Before Mayawati happened, the Dalit response rate in UP was very low. But not anymore, because now Dalits in UP are now wearing their identities on their shoulders. So they are not really shy to say. Before Lalu happened in Bihar, the Dalit response rate was low. After Lalu happened in Bihar — one may call his rule as a misrule, or whatever — I have no problem in saying that Lalu Yadav was a game changer in Bihar for that downtrodden, that oppressed Dalit and other communities in Bihar to come up in their social aspirations and their sense of being powerful. So I don’t get a poor response from the Dalits in Bihar. It doesn’t happen anymore.

Yes, in Rajasthan, I still get poor response [rates from Dalits]. In Punjab, I get poor response rates from the Dalits, including the Dalit Sikhs. In Madhya Pradesh, I get poor response from Dalits. So the states where the Dalit political identity is still weak, the political identity being weak is directly proportional to their response rate.

I don’t know how to phrase it correctly. I may sound very unscientific, but I am talking only from my personal experience. The stronger the political identity, the better the response rate of the community is supposed to be.

SS: Do you see this with Muslims as well across states?

YD: Yes. It is there. But with the Muslims, it’s not that they don’t wish to talk. It’s not that. Whenever we send the same-faith researchers, the response rate dramatically increases. They don’t say much, their response rate is down, when there is a sense of polarization. Then they go quiet. But in a normal situation, they speak well. They speak as good as Hindus actually when there is no fear of polarization around.

But yes, it is next to difficult to interview the Muslim females. It is almost impossible because of the cultural context. We must respect and understand that. And it’s not just related to India. I have worked in Indonesia. I have worked in other South Asian countries. I am right now working in the Middle East, bigtime. And I know it’s impossible for a Muslim female to talk to a male stranger. It’s the cultural context of the situation. The same is applicable. And then what happens is that in the puritan way, as many of the methodologists call it, even if she agrees to be interviewed, the chances are that she has to be interviewed in the presence of male members of the family. And when you are asking about the questions that are uncomfortable, can you imagine a Muslim female being interviewed on the uses of contraceptives by a male stranger in the presence of her husband? Can you imagine that? And can you imagine that the answer flowed freely without interruptions or without any things from her husband?

And that applies equally well on the Hindu females. Can you imagine a Hindu stranger talking to a Hindu female in a household about the contraceptives she is using in the presence of her husband or father? I’m sorry, if somebody says yes, he is lying. I’m sorry to be so harsh about using the word “lying” because it’s impossible in this cultural context. There are cultural sensibilities which are to be taken care of, which are to be honored.

And when you honor those cultural sensibilities, that doesn’t mean you are compromising the quality of the data. It only means you are trying to figure a way out on how to get the best possible data. So the answer is that if you have something sensitive like uses of contraceptives, it is always better to send the same-faith female researcher who can interview alone and get the data.

And the same thing happens with the hard-to-reach populations. We did a fabulous survey when we were doing the census for sex workers in one of the districts in Bihar. It was wonderful research. How do you expect a sex worker, a victim of human trafficking, to reveal their socioeconomic problems to a complete stranger? What are the chances like? We know how difficult it is to interview the hard-to-reach populations in stressed conditions. That’s why we did a project over there where we actually trained a big number of children of sex workers who are minimum high school graduates. And we trained them as a researcher to go and conduct the interviews in their localities. Now they knew the localities. They were talking to their moms, their sisters, their neighbors. And they were more likely to get the data correctly. So we named that project as Project VASE, Victims As Social Evaluators. And it worked fabulously.

Even when we were doing our post-tsunami research in Aceh. Because in Aceh, there were conflict victims as well as tsunami victims. It was almost impossible for the normal researchers to go into the interiors, to visit those areas, and talk to the female multi-victims, as we say, those who were the tsunami victims as well as the conflict victims. There we trained a special team of ex-guerrilla fighters of GAM [Free Aceh Movement] who had laid down their arms and surrendered and were looking for better jobs. We trained an entire team of them to conduct research on sexual harassment and other cases and other difficulties. Fabulous set of data. Because first of all, they were themselves the victims. They understood the sensitivity of the subject. And they were going to interview other victims. And they were coming up with data which was kind of unheard of. When you get an interview from somewhere in the interior of Aceh, where in the end of the form, the researcher has written that this lady was raped three times by the security forces. Now that’s an insight which no one can ever provide you. That’s an improvisation.

So Project VASE has been something which is very close to my heart. I have been working on that. I used that in Bihar for the sex workers survey. I used that methodology in Aceh. In Sri Lanka, in the northeast of Sri Lanka. It’s amazing. It’s amazing. That is something which I am going to focus more and more on, on how to get better responses from hard-to-reach populations, in special conditions. These are special populations.

But in routine conditions, yes, my solution in India is very simple and straightforward. Sending more female researchers gives you a better female response. Sending more same-faith researchers gives you a better response from Muslims. Sending more same-community researchers gives you a better response from the Dalits.

SS: Going back a bit to your discussion of Muslims, you said that at times when there’s a sense of polarization, the response rate decreases among Muslim respondents. Could you provide some examples of that?

YD: In Gujarat, we have always been getting lesser response post-Godhra [riots in 2002]. In fact, every time there is a Gujarat election we do a special sample in Godhra. That is something which is part of our study also.

Whenever there is a communal riot or something like that, [in the] post-riot scenario, the response rate of minorities actually goes down. Especially in the areas where they are even less spread, but really lesser in number. Probably the fear cycle just takes over. They don’t really wish to talk and be seen as something different. They want to assimilate and be safe. Their response rate goes better only when you are entering a completely ghettoized area where they are kind of feeling safer and the same-faith researchers are going. Then it goes up.

In a normal scenario, they respond as normally as the Hindu or other communities. This happens only when the political surveys are happening and if the atmosphere is charged on communal lines. Then only this thing happens. Otherwise, normally we get the normal response rates.

SS: Did you see that in UP after the Muzaffarnagar riots?

YD: Yes, that happened. That happened in UP after Muzaffarnagar.

SS: The last group I want to check with you is Scheduled Tribes. Do you see a different response rate among Scheduled Tribes?

YD: Yes, yes, yes. It is lesser among the Scheduled Tribes as well. But that lesser response rate is not because of something like they don’t wish to talk. Scheduled Tribes, there is no fear. They wish to talk. They can talk to you easily.

It depends on what exactly are you asking them. So if you are asking about something like– and actually, funnily enough, we do questionnaires like that from clients. “What is your view on Russian President Vladimir Putin? Are you favorable or unfavorable?” Try talking that to a tribal in Gumla district of Jharkhand. I mean, come on! Come on. I sometimes really wish to ask those who frame those questions sitting in some plush university campus in America or Europe, “Man, go out of your office. Try interviewing ten people you come across outside your office and ask them how many of them know the name of Mr. Putin over there. And you are expecting that a person in a remote Gumla tribal village is going to answer that?” I mean, it is more contextual, more on the subject. So the issue of response among the tribals is not because they don’t answer. You talk about their problems, their day-to-day lives, the issues around them, they are more than happy to answer. It’s just that they are more likely to say in that kind of questionnaire, 99% of the questions will be, “Don’t Know / Can’t Say.” And when you say, “Don’t Know / Can’t Say”, how can we accept this interview where everything is “Don’t Know / Can’t Say?” “Don’t Know / Can’t Say” is information as well. They just don’t know Putin! What is the big deal? Hell. They might even know about Putin but they certainly won’t know anything about the head of the state of Brazil for sure.

How many in this room…? I mean, we are sitting in New Delhi. You go and talk to that table, “What do you think, favorable or unfavorable, of Dilma [Brazilian President Dilma Rousseff]?” The next thing they are going to say, “Ah, Dilmah. They make good tea in Sri Lanka.” (laughs) What are we talking about?

So the issue of non-response also kind of has to be seen in the right perspective. The issue of non-response or lesser response has directly to do with what exactly you are trying to ask.

SS: Do you see big differences in response rates across states?

YD: Yes. There is. There is. The more politically charged-up states actually get a better response rate. The more educated states also give better response rates. The response rate goes down typically in urban, affluent locations where the people just don’t– They are educated. They are upwardly mobile. They are wealthy. They know. But the response rate is going down. They are least interested in talking to you. So that’s a different kind of problem of response. That’s the other end of the socioeconomic spectrum. The affluent class, the upper middle classes, their response rate is going down because they don’t wish to answer. They don’t have the time. They are not interested. They are just too happy counting their money and living their life.

SS: When you said the more politically charged states, which states do you mean?

YD: Bihar, Tamil Nadu, UP, parts of Maharashtra. We are likely to get a good response rate.

People are very happy to talk. Indians are very talkative. They love to talk to you on any issue under the sun. And it’s not that they don’t wish to answer. The problem is that on CATI, for example, for us a ten-minute interview on CATI usually ends up being fifteen or sixteen minutes because people keep on talking. We don’t know how to cut them [off] because it is rude to cut them [off]. We cannot cut them [off]. “Sir, please don’t explain your answer. Just tell it to me yes or no.” No. They don’t tell you as yes or no every time. They give the context and they talk.

Probably they talk to us more because of our media identity. That is something I am agreeing to concede. Our branding is primarily of a media entity. People have seen so much of our name and face on TV and the newspapers that when we say we are calling from CVoter, we are doing a survey for India TV or Times Now, probably coming from the media background opens them up immediately. We also designed the questionnaires in a way which starts with the problems of their life, to open it up. The moment you talk to them about their problems, their lives, they are more likely to open up and talk to you. And then they talk like hell. They keep on talking.

SS: Have you found for election surveys that some states are easier to poll? That you have a better track record with some states than others?

YD: No, nothing of that sort. I believe that from the election polling perspective, the only easier polls are those when the anti-incumbency issue is huge and the feedback from the people is so crisp and clear that there is no confusion whatsoever in your mind that, okay, this government is going to lose. They are the easiest ones. Otherwise, I would not say any particular state is difficult or easier to poll. Every state is similarly as difficult and as easy to poll, but depending on that particular election, whether or not there is a clear-cut winner or clear-cut loser.

On transparency of pollsters, disclosure of methodology, and the forthcoming Indian Polling Council

SS: A number of the people I’ve spoken to, including researchers — and this is again something you were saying earlier — have critiqued the lack of transparency among pollsters in India when it comes to reporting methodology. What do you think can be done to make pollsters more transparent about how they’re doing their surveys?

YD: We have already taken the first step. You might be aware of this. This we have been trying for a really long time, and now there is a group formally in place called the Indian Polling Council, to start with the colleagues from [names of polling companies have been removed]. They have pitched in together. The idea is basically to make everyone accountable. The idea is to make this entire process more open.

We have been trying. Our office and our working posture has been always open. You have visited yourself. You can talk to any colleague in the industry that those who are confident in their work are always open. When I say open, that means even physically opening up the facilities, to visit the office, see the data, see the process, how it is being done. It is as open as that. So those who are trying to do that in media, especially the ones who have been doing it for a longer time, they really do feel the importance of making it more accessible, more transparent so that the public at large, the media at large gets educated and they can differentiate a good apple from a bad apple. Purely on the merit of the data. Institutionally speaking, it is an important thing to do. So that’s why we have formed that.

The biggest differentiator of those who would be in this organization or this setup and those who would be outside this ambit is the question of accessibility. Anybody who is in this group is committing and opening up for peer review of their data if there is any controversy around. If I am opening up, “Okay, you are happy. Please come and visit my data. See the data. Analyze it. Figure it out. If I got it right, why? If I got it wrong, why?” Whatever it is. If I am opening it up, I am pledging myself to do that, then I am transparent enough for my industry and my peer review. If somebody is not agreeing [to] that and remaining opaque, their problem. But at least people would know who is part of the Indian Polling Council and who is not. Who is opaque and who is transparent.

Somewhere we have to draw the line. And I believe the responsible ones have started realizing that. It took a little bit long. I believe that this should have been done five years back. But better [to] be late than never. We are starting it now, and we hope that this will help us in the long run.

SS: Would the Indian Polling Council determine– members would have to disclose certain things about their methodology, their sampling, their margin of error? I haven’t seen margin of error in almost any of the polls I’ve seen reported here.

YD: Yes. Pretty much. We are going to make a format which will be required to be filled up by all the pollsters every time they release a poll. That standard form would be uploaded on the website. So for any research the disclosures are complete and full. That standard format is applicable to anybody who is willing to offer. And this has been asked by the Press Council of India, that it should be done. This has been advised by the Indian Law Commission, this should be done. And they actually advised that this should be done in consultation, something like the British Polling Council.

I wrote up to Nick Moon, who is the Secretary of the British Polling Council. Nick Moon was running NOP. They immediately agreed to help in every possible way, that we can pick up all those standard protocols, SOPs, from their things and apply. So I wrote to Kathy Frankovic and she happily said that she would be helping in every possible way. I am also a member of WAPOR. So what I am trying to say is that those things which have been done and tested over there, we just have to try and apply everything.

If anybody has a problem with that kind of transparency, it’s for that organization or that person to explain why they are uncomfortable. But if you are to be part of this particular setup, you need to be transparent. You need to be open for a peer review. That’s the bottom line. That if there is a question mark, then you shall be sending the entire dataset for the peer to see through and understand what the problem’s like and help you with the answers. It’s about the intent. Transparency is basically a question of intent. Whether you are opened up or not, whether you have the intent to let the peers review, that’s transparency. Those who are not, it will automatically come out that they are not a part of this group. Because they refuse to follow these procedures. They refuse to follow these standard operating protocols. They refuse to disclose this part of the methodology. They refuse to disclose the minimum amount of information that should be released with every poll. So we are forcing now these guidelines.

This should have been done many years back. I understand that. But, I don’t know why, the seniors at that point of time did not do it. I don’t know why and I am not questioning them. Maybe these were not the important things at that point in time because there were only a handful of players and even then we were in good communication with each other, and offering help and solutions to each other. It was a good, cozy club. We used to talk to each other and give advice to each other. If I had a problem, I could have picked up the phone and [talked to] Mr. Yogendra Yadav and he had been kind enough to suggest [to] me the solution at that point in time.

But now we have moved because all of a sudden we are looking at the industry growing. It’s good that the industry is growing. If it grows, it’s good. But then the growth has to come with the discipline of the quality consciousness and transparency.

SS: How soon will the Indian Polling Council be a more formal organization? Will we see this in place by the time of next May’s elections?

YD: Right before that.

It’s already in place, first of all. We are in regular communication as a group. It’s already in place. We are talking to each other. We are helping each other. It’s just probably I believe that we would like to do it this calendar year, that everything [is] formally in place. Most probably, if all goes well, we would like the general secretariat to be at CSDS for a simple reason that number one, they are the oldest of the bodies who have been doing that, and number two, they are [a] noncommercial venture. They are academic in nature. So they are a nice [place] to be stationed. I hope that Sanjay [Kumar, Director of CSDS] agrees to that. We are likely to have our very first meeting very soon.

SS: Thank you very much.

Rajeeva Karandikar explains how he converts votes into seats


One of the greatest challenges in Indian psephology is the projection of seat counts for Vidhan Sabha and Lok Sabha elections. While survey data can reveal how much vote share each party will win with a high degree of accuracy, they do not show how that vote is distributed across a state. Projecting that a party is to win a certain percent of votes does not say how many seats it is expected to win in an assembly, and therefore whether that party will perform strongly enough to form a government. In cases where the vote share of one party or alliance is much larger than all others, predicting the winner is fairly simple. In cases where the vote share of different parties or alliances is quite close — as is anticipated by nearly all polls for the Bihar elections — the vote-to-seat conversion model is key to predicting the winner.

Because this is such a crucial and challenging step to the projection of election results, pollsters are especially reluctant to share how they do their vote-to-seat conversion. The exception is Professor Rajeeva Karandikar, who for a long time did the seat share projections for the Lokniti unit of the Centre for the Study of Developing Societies (CSDS). Professor Karandikar is very open about the principles undergirding his vote-to-seat conversion models. Last year, he wrote a piece for the Hindu Centre for Politics and Public Policy outlining CSDS’ sampling methodology and detailing the principles of the model he uses to convert votes into seats.

Rajeeva Karandikar is the Director of Chennai Mathematical Institute in Chennai. He obtained his PhD from the Indian Statistical Institute (ISI) in 1981, and was on the faculty at ISI till 2006. He had a brief foray into the commercial world at Cranes Software as Executive Vice President from 2006-2010. Karandikar has made significant contributions to various areas including stochastic calculus, filtering theory, limit theorems, Monte Carlo techniques, and the theory of option pricing. He is a fellow of the Indian National Science Academy and Indian Academy of Sciences, and was awarded the S S Bhatnagar prize by the Council of Scientific and Industrial Research in 1999. Karandikar has been involved with opinion polls and exit polls in India since 1998. He worked with India Today, Doordarshan, and TV Today (Aaj Tak) from 1998-2005, and with CNN-IBN from 2005-2014.

On Tuesday morning, I met with Professor Karandikar at the Indian National Science Academy in Delhi, where he was visiting for the day. Sitting in his hotel room, he explained to me how he found himself working in psephology, painstakingly went through each step of his vote-to-seat conversion model, and considered what might be required to ensure higher standards of methodology disclosure. The transcript of the conversation below has been lightly edited for length and clarity.

On his involvement in public opinion research

Sam Solomon: You are a mathematician by training. Could you please tell me about your background related to public opinion research and how you got into psephology?

Rajeeva Karandikar: I’m a mathematician/statistician and keen observer of politics, though I have nothing to do with it. Fifteen years after my PhD, I was on the faculty here in Delhi at Indian Statistical Institute, and I used to read whatever little was written in the media about the methodology of opinion polls. That may not have been [the] whole truth but whatever little blurbs were coming in, I used to be fairly critical of that. Especially about sampling methodology.

We had colleagues who were economists. One of my colleagues, Bhaskar Dutta, had some contacts at CSDS. He had come to CSDS for some consultation. And that’s where Yogendra [Yadav] asked him, “We are now going to do a big national poll and we want a proper statistician to advise us on statistics.”

SS: This is when he was launching Lokniti?

RK: Yes, in ‘97. So Dutta came and asked me, “Would [you] be willing?” And I said, “Definitely.” That’s how he got me in touch with Yogendra.

At that point, polls were not [that common].

SS: They did them in the 60s and the 70s, right?

RK: Yeah. In the 60s and the 70s, they did them on a small scale because of money. They developed a robust methodology, questionnaire design, all these things they developed. But ‘97, they were going national. Or maybe they had done it once national, but now it was going to be sustained national polls coming in.

You know, questions like, do we go to the same people who we interviewed last time or do we pick fresh sample? What are the pros and cons? So Yogendra had such questions. And he is a fantastic applied statistician. Apart from being a political scientist, he has a great sense of applied statistics. But he wanted to know theoretically how sound it is and so on. So that’s how I got roped in.

And then within a month of our meeting the parliament got dissolved and the polls were announced for three months later. So we got frantically working for a national poll. That was ‘98. That is where it began.

And in that I got involved so intricately that from then on, it’s almost become my second profession. Almost.

SS: So when you described this in the Hindu Centre article, you had these extensive conversations with Yogendra Yadav but also with a British professor, Clive Payne, about how to do this research and how to convert vote shares into seat shares. You say in the article that you thought UK-style models would not work with the Indian political system. I’d like to know–

RK: A little more about that.

SS: Yes. The unique features of the Indian political system.

RK: In any such applied work, whenever it’s involving sampling, let’s say, the first and foremost important thing is what kind of data is available. By that, I mean the sampling frame, that is, the list of potential respondents. The population. How is it listed? That is number one.

Number two, what characteristics of these respondents are available to you? For example, do you know their age? Do you know their sex? Do you know their economic condition? Do you know their education? Do you know their other social background? In a list. Or, if not at the individual level, at what level are these characteristics available?

This is one of the big differences between the UK and India. In the UK, [for] precincts, which are roughly like polling booths or clusters of polling booths, the socioeconomic profile is available. Not for individual persons, but at least at the precinct level.

SS: By socioeconomic profile, what specifically do you mean?

RK: Education, economic condition. In the Indian context, it could be religious breakup. It could be caste breakup. It’s not all, but at least some of the socioeconomic variables — could be education level, could be income level — are available at the precinct level. That information could be used in modeling the final outcome, and when you can do that then that can also decide your sampling. So this whole thing has to be done together.

In India, that is not available. The census data is there, but census data is aggregated and published at the level of a district. So a lone booth or cluster of booths, even a constituency’s socioeconomic profile is not available because there is no unique identifier between districts and constituencies. A constituency can have parts of different districts and a district can be distributed across several constituencies. This is a reality one has to take into account for modeling purposes. So this is one thing which they use strongly in UK models that Clive Payne uses which we do not have here.

The other important difference was what I call volatility of opinion: what percentage of people change their opinion from one election to the next, say Lok Sabha to Lok Sabha, Vidhan Sabha to Vidhan Sabha. In the UK, talking to Clive Payne, at least at that point in time — it may have changed now [since] this conversation was seventeen years [ago] — the perception was that in the UK the volatility is rather low. Across large sections, all their lives they continue voting Labour or Conservative. Whereas in India, it’s highly volatile. Extremely volatile. As you can see from successive percentages, if you try to look at any kind of data historically, you can see wild fluctuations at the aggregate levels. So now you can imagine [the fluctuation] at the individual level.

At one point, that is ‘98, with that Lok Sabha poll, we had the luxury of doing the following: We did a pre-poll for the whole country just before the first phase, published the outcome. Then CSDS had something for a television program, a counting day program, where Yogendra’s idea was, and we did it rather successfully–we researched, we worked out the model, and the idea was his, but we worked it out: In those days, the counting used to take three days or two days. And the idea was that at any given time, looking at the counting data at that point, [to] make a prediction of the final parliament. So you take the opinion poll and you take the counting data. So for this purpose, Yogendra got money to do a post-poll. Though we were not going to go on air with our findings of the post-poll, we got money to do the post-poll, which was going to be used for this part. This part worked out beautifully. Anyway, that has lost meaning with the electronic voting counting and [after] four hours everything is over. So our ESP, as we called it, Early Seat Prediction or ESP, that has lost its sheen. But at [that] point we did have it.

Anyway, the point is we got the money to do the pre-poll and the post-poll and we went to the same respondents. What we found was that 30 percent of people had changed their vote. And the timeline was, one third of the country eight days, another one third sixteen days, another one third maybe twenty-six days. So on average a little over two weeks and 30 percent of people have changed their minds. What that shows is actually opinions get formed at the fag end just as voting day comes near.

So I have grave question marks on the predictive power of any opinion poll done by whatever, [even] the best of methodolog[ies]. In fact, for several years CSDS and me, that is, we were not doing any pre-election poll.

SS: This was the first year, with Bihar, that they got back into pre-polls.

RK: In the assembly. For the Lok Sabha, we were resisting, but we were told that if you don’t do it, we have to go to someone else. The channel cannot just stay out of that race. So we did that.

Predictive power is rather poor because the volatility is too huge. Some small thing, some statement by someone changes public opinion and everybody turns around and changes their mind. In fact, at a few points, in various times in CSDS polls, we started putting questions like, “Have you decided whom you are going to vote for?” Then, even if they said no, then our follow-up question is, “If the polls were tomorrow, whom are you going to vote for?” Again, 25-35% of people say they have not made up their mind in the pre-poll data.

SS: Did you find that the people who made up their minds later fit a certain demographic profile?

RK: We tried this, but not really. India is not really one homogeneous country. But as soon as you drill down to a state or one section then your sample size is too little to make any such inferential correlations. And it changes from one election to the next. Perhaps the more educated ones are less likely to change.

SS: Is that your own personal experience? Your friends, your family, do they switch their votes pretty regularly from one party to another?

RK: Not as much as we found in the surveys. My friends, my family–

SS: And yourself?

RK: Myself. We make up our minds fairly early. But this fraction is low.

With Bihar — now I have not looked at any post-poll data, I don’t know if you have — but without the post-poll data, I’m not putting too much weight on the pre-poll findings. At best, my statement is that while going into the polls they were neck-to-neck, both sides. And later on we are hearing all kinds of buzz by the media, most of it the English-language media, who are painting one kind of picture. Maybe that is the correct one. Maybe they have feedback from the post-polls or exit polls. I don’t know.

SS: I guess we’ll see.

RK: We’ll see.

On vote-to-seat conversion

SS: Based on your article, a key assumption of your model is that the change in percentage of votes for a given party from the previous election to the present is constant across a given state.

RK: Or some region in the state.

SS: Or some region in the state. This is actually different from what I’ve heard from some pollsters about elections in India and they have said that there isn’t really such a thing as uniform swing.

RK: It is true that any kind of statistical analysis post facto, or back testing as it is called, if we do this model it fails miserably. But this model is the basis which allows me to mathematically come up with a methodology, as opposed to just ad hoc. And this has proved to be reasonably good and no one has been able to show me a methodology which is better, other than their gut feeling.

SS: And a big part of it is that you’re predicting the aggregate number of seats, not particularly for any individual constituency.

RK: Absolutely not. Absolutely not for the constituency. In fact, when we do parliamentary polls I used to be extremely reluctant to even give [projections] at the state level. I would say that at best we will give the four regions: north, south, east, west. Not even at the state level.

SS: So you notice state swings but also regional swings within states. How do you group the regions of a state then?

RK: Grouping of regions is done politically by CSDS. So they’ll say, “Socioeconomically, this is [region] 1.” They used to do the splitting of states. It was generally geographic, but still how would they draw the boundaries? This is where their understanding of the socio-economic conditions across the country played a crucial role.

Let’s take Maharashtra. Four regions. Greater Bombay is one region. Now suppose we measure that the overall swing for BJP statewide is 5%, and in Bombay region it is 3%. We’ll assign some way, either half-half, or one third-two thirds, so I’ll take half of the state swing plus half of the region swing; that would be my swing for every seat in Bombay region.

Now these regions can also be extended to what we may call phases. Now that elections are going on for a month and what happened first phase and what happened in the fifth phase, lots of things have changed. So I’ll take one third of the state swing, one third of the phase swing, and one third of the region swing, let’s say.

SS: The phase swing, what would that come from?

RK: The phase swing would come from last election in this phase what was the percentage of votes, and in my opinion poll [for this election] what was the percentage of votes. So the difference will give you the phase swing.
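Editor's note: the arithmetic of blending swings can be sketched in a few lines of Python. This is an illustrative sketch, not Professor Karandikar's actual code. The function names and the phase figures (38% and 35%) are hypothetical; only the 5% state swing and the 3% Bombay region swing come from his Maharashtra example.

```python
def swing(current_share, previous_share):
    """Swing in percentage points: current poll estimate minus previous result."""
    return current_share - previous_share


def blended_swing(swings, weights):
    """Weighted average of several swings (state, region, phase, ...)."""
    if len(swings) != len(weights):
        raise ValueError("need one weight per swing")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * s for w, s in zip(weights, swings))


# Maharashtra example: state swing 5 points, Bombay region swing 3 points.
state_swing = 5.0
region_swing = 3.0

# Half of the state swing plus half of the region swing.
half_half = blended_swing([state_swing, region_swing], [0.5, 0.5])

# Hypothetical phase: 35% in this phase last time, 38% in the current poll.
phase_swing = swing(38.0, 35.0)

# One third each of the state, phase, and region swings.
thirds = blended_swing([state_swing, phase_swing, region_swing], [1 / 3, 1 / 3, 1 / 3])

print(half_half, phase_swing, round(thirds, 2))
```

With the half-half weighting, every seat in the Bombay region would be assigned a swing of 4 points. The choice of weights is one of the subjective parameters discussed later in the interview.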

SS: And you look at the phase across the entire state?

RK: Across the entire state.

SS: And so you take the last election, 2009, or 2014 and you’d say–

RK: No, no, no. So in Bihar, there are lots of interesting points which will emerge. Primarily, we take not 2014 but the 2010 Vidhan Sabha poll.

SS: For Bihar 2015, you take the 2010 Vidhan Sabha data.

RK: However, from 2010 to now, lots of things have happened. The alliance structure has changed dramatically. Somehow we have to take that into account and come up with what we call a simulated initial file. If nothing had changed, the 2010 [file] would have been the initial file on which we look at the swings. But now what we do is — this is ad hoc — but firstly, wherever BJP contested on behalf of the [former JD(U)-BJP alliance that governed Bihar from 2005 to 2013], I would split it, let’s say, 60/40. And if it is a JD(U) candidate, I will give it 60/40 [favoring] the JD(U).

SS: And how do you come up with those figures? You just have to make some sort of subjective judgment.

RK: Yeah. Some sort of subjective judgment. And it used to be easier when Yogendra was a partner because then he would have a very good sense of these things.

SS: Just from his own experience.

RK: Yeah. And over the years I think I have developed a keen [sense]. So now I don’t fret over it. I do it and proceed.

The other thing is it is very robust because it is only used to do the initial file. 60/40 or 55/45 or even 65/35 is not going to make a huge change in the final analysis for the total prediction at the state level. I’ll experiment with this. Of course if I do 10/90 or 90/10 it will not be [similar], but in a certain range it really doesn’t change dramatically.

SS: It doesn’t make a huge change at the aggregate level. Maybe at the individual constituency [level].

RK: Aggregate level.

So likewise, when two groups align, then we don’t simply add their votes, but there is going to be some friction. So we might take 85% or 90% or something like that.

Or if it were UP [Uttar Pradesh], let’s say, and in the previous election Mayawati and Congress fought separately, and let’s say in this election they are fighting together, Yogendra would say that, “Whatever were Mayawati’s votes, even if there was a Congress candidate, it will transfer 100%. But whatever Congress votes, if there was a Mayawati candidate, it will only transfer 50%.”

SS: And that’s just knowledge he has from being in Indian politics for a very long time.

RK: Yes. So we take into account whatever we can to bring in a subjective judgment about, “If this were the alliance picture [in the previous election], what would the vote have been the last time?”
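Editor's note: the two adjustments described above, splitting a former alliance's vote and merging the votes of new partners with some friction, might be sketched as follows. This is a minimal sketch under stated assumptions, not the actual procedure used at CSDS. The function names and all vote shares are hypothetical; the 60/40 split, the 85% friction figure, and the 100%/50% transfer rates are the ones mentioned in the conversation.

```python
def split_alliance_vote(alliance_share, contesting_party, partner, contesting_fraction=0.6):
    """Split a past alliance vote between the party that fielded the candidate
    and its partner. The 60/40 default is a subjective judgment."""
    return {
        contesting_party: alliance_share * contesting_fraction,
        partner: alliance_share * (1.0 - contesting_fraction),
    }


def combined_alliance_share(candidate_share, partner_share, transfer_rate):
    """Vote for a joint candidate: the candidate's own party vote plus the
    fraction of the partner's past vote assumed to transfer."""
    return candidate_share + partner_share * transfer_rate


# Hypothetical seat where the BJP contested for the old alliance and got 40%.
print(split_alliance_vote(40.0, "BJP", "JD(U)"))

# Hypothetical seat where two newly allied groups polled 25% and 20% last time,
# with 85% of the partner's vote assumed to carry over.
print(combined_alliance_share(25.0, 20.0, 0.85))

# UP illustration with hypothetical shares: Mayawati 30%, Congress 10%.
print(combined_alliance_share(10.0, 30.0, 1.0))  # Congress candidate, Mayawati votes transfer fully
print(combined_alliance_share(30.0, 10.0, 0.5))  # Mayawati candidate, half of Congress votes transfer
```

Applying these adjustments to every seat in the previous election's results gives what Professor Karandikar calls the simulated initial file.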

Now, in Bihar, there is another challenge. We see that there was a Lok Sabha election which changed the realities tremendously. So for Bihar what I had been doing was, I took the file from 2010, applied the 60/40 [vote split between the JD(U) and BJP], and then I took the [2014] Lok Sabha [file], and I took an average of these two.

SS: You took an average of the [re-weighted] 2010 Vidhan Sabha file and the 2014 Lok Sabha file as your base [file].

RK: Under my model, the level of support is determined by the opinion poll. I am not applying any subjectivity there. The whole thing is how is that vote distributed across the state? If you say the BJP is getting 41% in Bihar, how was it distributed across the constituencies? So my base file’s only role is to make an assessment about how it is distributed.

SS: How it’s distributed within each constituency.

RK: Within 243 constituencies [for Bihar]. Where is it that they are polling more than the average? Where are they lower than the average? Things like that.
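Editor's note: averaging two base files seat by seat could look like the sketch below. It assumes both files cover the same constituencies and parties, which glosses over the work of mapping Lok Sabha results onto assembly seats. The function name, seat labels, and vote shares are hypothetical.

```python
def blend_base_files(file_a, file_b, weight_a=0.5):
    """Seat-by-seat weighted average of two base files.

    Each file maps a constituency name to a dict of party -> vote share (%).
    Assumes both files list the same seats and the same parties.
    """
    blended = {}
    for seat, shares_a in file_a.items():
        shares_b = file_b[seat]
        blended[seat] = {
            party: weight_a * shares_a[party] + (1.0 - weight_a) * shares_b[party]
            for party in shares_a
        }
    return blended


# Hypothetical two-seat example.
file_2010 = {"X": {"BJP": 36.0, "JD(U)": 30.0}, "Y": {"BJP": 28.0, "JD(U)": 40.0}}
file_2014 = {"X": {"BJP": 44.0, "JD(U)": 20.0}, "Y": {"BJP": 38.0, "JD(U)": 26.0}}

base_file = blend_base_files(file_2010, file_2014, weight_a=0.5)
print(base_file["X"])
```

The blended file is only used to describe where a party is stronger or weaker than its statewide average. The statewide level itself comes from the opinion poll.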

SS: And then you’re going to take the uniform swing — or the uniform swing plus the regional swing plus the phase swing — to calculate what you think would happen in each constituency?

RK: Once again, the base file and the model together tell me how this 41% of votes is distributed across the constituencies. That model, I am not using it to say it is not 41%, it is 44%, no. The level is pegged by the observed data. The model and the base file altogether only help me in attributing that across constituencies. For the BJP as well as for others.

SS: At the aggregate level and not in individual constituencies.

RK: No, no. I do it at each constituency, I aggregate assigned votes. Then I would say who is winning where and I will add it up to get my final number.

SS: But the actual constituency level projections are not– the model doesn’t do well at the micro-level.

RK: No, no, absolutely not. Absolutely not.

SS: [So] in the aggregate the errors cancel each other out. Or at least, that’s the theory.

RK: That’s the theory. And at least I have observed [it] in practice reasonably well. If our estimates are good, this is good.
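Editor's note: the step of assigning votes in each constituency and then adding up the winners might be sketched like this. It is a simplified illustration that counts a hard winner in every seat; as discussed further on, the actual model works with probabilities of victory. The function name, the party label "Rival", and all figures are hypothetical.

```python
def project_seats(base_file, swing_by_party):
    """Add each party's swing to its base-file share in every seat,
    then count how many seats each party leads."""
    projected_file = {}
    seat_tally = {}
    for seat, shares in base_file.items():
        projected = {
            party: max(0.0, share + swing_by_party.get(party, 0.0))
            for party, share in shares.items()
        }
        projected_file[seat] = projected
        winner = max(projected, key=projected.get)
        seat_tally[winner] = seat_tally.get(winner, 0) + 1
    return projected_file, seat_tally


# Hypothetical three-seat base file and swings (in percentage points).
base_file = {
    "X": {"BJP": 36.0, "Rival": 39.0},
    "Y": {"BJP": 45.0, "Rival": 30.0},
    "Z": {"BJP": 30.0, "Rival": 42.0},
}
swings = {"BJP": 5.0, "Rival": -2.0}

projected_file, seat_tally = project_seats(base_file, swings)
print(projected_file["X"])
print(seat_tally)
```

Individual seat calls from a sketch like this are unreliable. The claim in the interview is only that the errors tend to cancel when the seats are added up.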

SS: I was hoping that we could — we kind of did it already — but we could walk through each stage of the process, slowly and clearly, just to explain the vote share to seat share conversion. So I have it all on [the] recording.

RK: Yeah.

SS: So first, you do the survey to determine the uniform swing across the state. You also have a regional swing and a phase-based swing. And then you’re using that to calculate projected vote share in each constituency.

RK: For each party. For each of the major parties.

SS: So, for example, if the BJP is projected to win 5% more votes statewide in Bihar, and they won 36% of Constituency X in 2010. Then it becomes 41% in 2015. You just use Vidhan Sabha figures to build the model, but you just said that–

RK: In special cases like Bihar this time.

SS: And the reason for that is because the 2014 Lok Sabha was such a huge–

RK: Tectonic change. It’s a big change.

So that is why while there is one broad model, there is no relevant software where you just plug in data. Each time we may have to come up with some tweak, like this one in Bihar.

SS: So once you’ve got the projected vote shares in each constituency, you calculate the probability of victory for each of the parties for each constituency. Do you do this by doing a multiple proportion z-test for each constituency?

RK: Something like that, something like that. And there is again a subjective parameter, the sigma, or the standard error for the vote estimate.

SS: That’s subjective. You’re using the standard error for the entire sample but you’re using it for just a constituency.

RK: So therefore we put a factor of safety. So whatever is the sigma, I inflate it by 2 or 3 or something like that.

SS: You’ll inflate the standard error to allow for more uncertainty in the model. Okay.
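Editor's note: for a two-party contest, the effect of inflating sigma can be sketched with a normal approximation. This sketch assumes the two vote-share estimates have independent normal errors with the same standard error, which is a simplification made for illustration and not a description of the actual model. The function names and the numbers are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def win_probability(share_a, share_b, sigma, inflation=2.0):
    """Probability that party A finishes ahead of party B in one seat.

    sigma is the standard error of each vote-share estimate; it is multiplied
    by the inflation factor as a margin of safety. With independent errors,
    the standard deviation of the lead is inflated sigma times sqrt(2).
    """
    lead_sd = inflation * sigma * math.sqrt(2.0)
    return normal_cdf((share_a - share_b) / lead_sd)


# Hypothetical seat: 41% against 37%, sigma of 2 points.
print(round(win_probability(41.0, 37.0, 2.0, inflation=2.0), 3))
print(round(win_probability(41.0, 37.0, 2.0, inflation=3.0), 3))
```

A larger inflation factor pulls every probability toward one half, so fewer seats are treated as safe for either side.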

But you only do this for the top three parties. Why do you stop at three? It seems like there might be some cases in India where more than three could be competitive in a constituency.

RK: The first time we did it, we did it with only two [parties]. And then we found that there were [many] places where the party [that was in third place] ended up winning. So we extended it to three. Now, rarely is someone who is fourth in my estimate actually winning. It’s very rare.

It can be done. It will make matters a lot more complex.

SS: And it doesn’t add a lot to the model.

RK: It doesn’t add a lot to the model. That’s the point.

To come up with these conversion tables and so on, all we had to do was do it with the bivariate normal distribution. If I have to go to four, I will have to go to three-dimensional Gaussian, and that can be done. If it was necessary, it can be done. But it simply was not worth the effort.
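Editor's note: with three parties, a party wins a seat when its margins over both of the others are positive, and under normal errors those two margins are jointly (bivariate) normal. The sketch below approximates the same probabilities by simulation instead of conversion tables. It assumes independent normal errors with a common inflated sigma, which is an illustrative assumption and not the author's specification. The function name, the party label "Other", and the figures are hypothetical.

```python
import random


def three_way_win_probabilities(shares, sigma, inflation=2.0, draws=20000, seed=0):
    """Estimate each party's chance of topping the poll in one seat
    by adding normal noise to every projected share and counting winners."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    parties = list(shares)
    wins = dict.fromkeys(parties, 0)
    for _ in range(draws):
        noisy = {p: rng.gauss(shares[p], inflation * sigma) for p in parties}
        wins[max(noisy, key=noisy.get)] += 1
    return {p: wins[p] / draws for p in parties}


# Hypothetical seat with three major parties.
probs = three_way_win_probabilities({"BJP": 41.0, "JD(U)": 37.0, "Other": 15.0}, sigma=2.0)
print(probs)
```

Extending the simulation to a fourth party only means adding one more entry to the dictionary, although, as noted in the interview, the closed-form version needs a three-dimensional Gaussian and rarely changes the outcome.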

SS: It sounds complicated.

And then you say it’s as simple as summing the probabilities of each constituency across all the constituencies. But wouldn’t you want to run the model many times and see what kinds of distributions you get?

RK: Of course. Of course. That I do.

So, for example, I can play with the different weights for state/region/phase [swings] and so on and so forth. I can play with different weights there. Then I can vary the sigma parameter, the inflating factor. I’ll run it several times and then get a sense and try to position myself somewhere in the center, normally.

Whatever are my subjective parameters, I vary those. I don’t vary the vote percentage. I vary the subjective parameters.
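Editor's note: summing seat-level probabilities and then varying a subjective parameter can be sketched as below. The poll-based margins are held fixed and only the inflation factor changes, mirroring the point that the vote percentage is never varied. This uses a two-party normal approximation with independent errors as a simplifying assumption; the function names and all margins are hypothetical.

```python
import math


def win_prob(lead, sigma, inflation):
    """Chance of winning a seat given a projected lead in percentage points."""
    z = lead / (inflation * sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_seats(leads, sigma, inflation):
    """Expected seat count: the sum of the win probabilities over all seats."""
    return sum(win_prob(lead, sigma, inflation) for lead in leads)


# Hypothetical projected leads for one party in five seats (negative = trailing).
leads = [4.0, 22.0, -5.0, 1.0, -12.0]

# Vary only the subjective inflation factor; the leads stay fixed.
sweep = {k: expected_seats(leads, sigma=2.0, inflation=k) for k in (2.0, 2.5, 3.0)}
for k, seats in sweep.items():
    print(k, round(seats, 2))

# Take a central value across the runs (here, the median of the three).
central = sorted(sweep.values())[len(sweep) // 2]
print(round(central, 2))
```

The same loop could sweep the state/region/phase weights or the base-file weights; the inflation factor is used here only because it is the simplest to show.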

SS: Does that include the base file? For example, if there’s an alliance–

RK: To begin with, I might have made two or three base files. I don’t vary them tremendously, but I might have made two or three base files.

SS: If there is some unusual alliance, as there is in Bihar.

RK: For Bihar, what I have is one pure 2010 base file, one pure [2014] Lok Sabha base file. That I will not use, but I will [weight them against each other by] one third-two thirds, half-half. So this is the variation I might try.

SS: With each of those, you do different projections based on changing state swing, or regional swing?

RK: No, weight for the regional swing.

SS: Ah, how much you balance the regional swing and the state swing.

RK: The main point is none of my things make a modification to the opinion poll data. It is just interpreting that vis-a-vis seats which is where I play with my subjective model and parameters.

SS: And again those would include how you’re weighting Vidhan Sabha versus Lok Sabha data for the base. It would include the different vote shares that the alliances are picking up. And then also you said the sigma, so how much uncertainty you add to the model for each constituency. Any other subjective parameters that I’m missing there?

RK: Not really [editor’s note: we both forgot to mention here how the state, regional, and phase swing are weighted against each other].

SS: How does this work for Lok Sabha elections? You say in the article that there’s uniform swing within states, within regions of seats [editor’s note: this is incorrect. He uses uniform swing and regional swing as the basis of making projections but does not describe the seats as uniformly swinging one way or another], but not across the country. Do you treat each state as its own election?

RK: Absolutely. Absolutely. Because that’s how India is, at least electorally. [There are] districts across Tamil Nadu and Karnataka boundaries, or districts across UP and Bihar boundaries. Socioeconomically they may be very similar but political behavior is very different. That’s why there is no nationwide swing or nationwide effect. So I treat each state as separate.

SS: Those subjective factors that you listed in the model, [those] can account for a lot of [variance] in seat projections, right? For a lot of the polls for Bihar, a lot of the vote share projections have been pretty similar but the seat share projections can be pretty different.

RK: See, but nobody else even says how they are doing the seat projections.

SS: So you have no idea how the other–

RK: No, because they don’t say. I am the only one who says, and I go and give lectures, and my methodology is public [and] open.

SS: But no one else does this because it’s their “secret sauce.”

RK: I believe there is no “secret” and there is no “sauce.” They just do it ad hoc, is my guess.

SS: Do you think so?

RK: Yes. They will have some rule, “Oh, if it is two parties and only 5% swing then maybe…” Something like that. Some rules.

SS: But not transparent.

RK: No, certainly not transparent.

SS: And not methodical in the way that you’re describing here.

RK: At least, they have not disclosed it.

I would have no issue in debating with anyone my methodology versus theirs, but no one else comes forward.

On transparency of pollsters and disclosure of methodology

SS: You spoke at the beginning about transparency in disclosure of methodology, and how you were looking at this in the ‘90s. Has it changed a whole lot from the ‘90s to now?

RK: Not really. Not really. We zeroed onto a fairly good methodology and it’s these tweaks, little tweaks, taking into account the ground realities which we may be doing, but other than that the primary methodology has remained the same.

SS: In terms of the reporting, the disclosure of how these surveys are done. I’ve talked to people in media but also researchers themselves who say, “We need to get better about disclosing our methodologies.”

RK: Absolutely. The only thing is it cannot be mandated by law, but it should be voluntary by the media.

SS: Why can’t it be mandated by law?

RK: There is a very tricky issue. The regulation of media is a touchy issue. What can be regulated and what cannot be regulated is also debatable.

For television stations, the government can come out with a law because these are of recent origin and when they got licensed to uplink their signal from the land, they signed on [to] all kinds of norms, and opened themselves for regulation by the government. So they can be regulated. Otherwise, their license can be cancelled.

However, when it comes to print media — because print media predates independence by a century — print media cannot be regulated. In fact, Mr. N. Ram [the publisher] of The Hindu [has given] lectures in which he has categorically said that, “We obey your 48-hour [embargo on releasing polls before Election Day] of the Election Commission, because we think it’s a good idea. Not because we are bound to it.”

SS: So there has to be some kind of internal self-regulation.

RK: There is something called the Press Council in India. There is some other body for the television broadcasters association. These bodies can together come up — Yogendra had suggested this, and I strongly support that — at least, when was the survey done, what was the sample size, who led the survey, what [were] the names of the team members who did that. And a little more about the vote percentages / seat percentages methodology.

After Yogendra became a member of AAP, he in fact said that the entire survey data should be made public. That maybe is going a bit too far. But it’s like an audit. No company accounts are made public. However, it is subject to an audit by a professional. Likewise, we can be required to maintain records in a certain fashion that can be subject to audit by other professionals.

SS: Why do you think this hasn’t happened yet, if Yogendra has talked about it, if so many researchers have talked about it, [and] people in the media are talking about it?

RK: I’m not sure how many actually want it. Yogendra definitely wants it. I want it. But for both of us, I have to say, that for both of us it was easy to say and even practice because this was not our primary profession. We both have a different profession. We have our livelihood secure. Whether this opinion poll happens or doesn’t happen, whether we are engaged or not engaged, we don’t care.

With Rajdeep Sardesai and me and Yogendra, we had a very clear understanding that Rajdeep had no say in what we were going to say. He could say, “Hey, what you are saying doesn’t seem to be happening here. It’s not going to happen.” We would say, “Okay, we’ll examine our analysis once more.” But he could not tell us to reduce it or increase it, no.

SS: You had complete independence from media sponsors.

RK: Yes, absolutely.

On experience conducting surveys over time

SS: You got experience not just doing the vote-to-seat projection but also working on the surveys themselves. The specific focus of my research is systematic sampling error in the polls, whether there are certain populations, demographic groups — gender, religion, caste-based — that are more or less likely to be represented in these surveys. Did you find consistent patterns related to that?

RK: With CSDS data, I don’t think there was a consistent pattern [other than] underprivileged groups being slightly underrepresented.

SS: Which types of groups specifically?

RK: Rural areas, SC [Scheduled Castes], ST [Scheduled Tribes]. But this, Sanjay [Kumar, Director of CSDS] would have a much better sense than me.

But with other surveys, there are grave doubts about what happens when they do convenience sampling. You know, most others do convenience sampling [or] quota sampling.

I have [put] on record [several times] that whatever [are] my successes or limited successes is largely due to great quality data by CSDS. If the data is no good, my analysis cannot be good to begin with. And CSDS has done a phenomenal job.

SS: Have you found that certain states are harder to project than others?

RK: Not really, though our track record said that we did very well in east India and better in south India. Not so well in west India. (laughs) But it’s by chance. I am not sure why it happened.

In the east, [Lokniti and I] have a phenomenal record.

SS: The northeast?

RK: East, not so much northeast. [In the] northeast, only [in] Assam have we done surveys. Assam, we have done very well. Bengal and Bihar, we have got bang on when others did not expect it. Both victories of Nitish: the first time in 2005 that he gets a majority, and then in 2010 [when] they crossed 200 [Vidhan Sabha seats], nobody had dreamt. Nitish himself had not dreamt. And we got it.

SS: And you did very well for southern states as well?

RK: Reasonably well for southern [states]. Not so well for [the] west. I am from [the] west. (laughs)

SS: Which southern states and western states do you mean?

RK: Gujarat in particular, we had a checkered record. Not so good. Maharashtra, so-so. But [the] south: Tamil Nadu we have done well, Karnataka we have done well. AP [Andhra Pradesh] okay.

SS: Thank you very much.