Journalism Apps

Digital Media Winter Institute 2019
SMART Data Sprint: Beyond Visible Engagement
28 January – 1 February 2019
Universidade Nova de Lisboa | NOVA FCSH | iNOVA Media Lab

Project title: Exploring the News Apps Environment on Google Play Store

Facilitators: Dora Santos Silva and Mariana Müller.
Team members (alphabetical order): Ana Marta M. Flores | Cristiana Freitas | Florian Lang | Harald Meier | Louise Knops | Marcia Lisboa and Roli Monk.

Key Findings
Introduction
Research Questions
Research Design Visual Protocol
Methodology
Findings
Discussion&Conclusion
References

Project pitch: here
Final presentation slides: here

Key Findings

  1. Journalists are not the only players in the news apps environment

  2. The first dataset (search by Category) gave us a more “accurate” network of News and Journalism apps than the second one (search by Keyword). The Keyword network, on the other hand, is bigger and more complex, and in it the concept of “news” is tied to information in general. A keyword search may therefore be more exposed to “misinformation”.

  3. In the apps environment, the concept of news (from a journalistic perspective) is very close to the common-sense idea of “information”.

  4. Tech companies sit very close to the news publishers cluster, which may reflect the current hybrid relationship between news organizations and technology developers.

  5. Different ways of searching produce different result scenarios, which suggests that the news app environment is highly dynamic and personalized, and therefore difficult to map.

  6. The flow in the news app environment is highly ephemeral: locations and dynamics evolve rapidly.

Introduction

The study of apps in the media field is scarce, and there is still little awareness that apps are key vectors of digital culture that can reveal important insights about dynamics, behaviours and innovation. In the context of mobile news, there are multiple means of distribution, from news alerts by SMS or WhatsApp to mobile news sites and mobile news applications (commonly referred to as apps). This project focuses on news apps, since their consumption is gaining popularity and grew 300 per cent in the last five years, outpacing desktops or laptops (Fedeli & Matsa, 2018).

The growth of customised mobile news apps is not exclusive to legacy media trying to enhance their cross-media portfolios. Other players, mainly news aggregators and curators such as Feedly or Google Feed, also play an important role in the market.

Research about mobile news apps has focused on the multimediality, interactivity and commercialization models adopted by newspaper publishers (Cantarero et al., 2017), on new editorial models and journalistic formats, from virtual reality to newsgames (Baccin et al., 2017; Fernandes, 2017), or on case studies of specific news media apps, such as NYT Now (Torres, 2015) or The Guardian’s (Silveira, 2017).

However, journalism is a construction of reality and is responsible for the world that appears to the reader, from the one created by a scandal tabloid to the one created by the fragmented consumption of news through Facebook. In the same way, the apps ecosystem constructs virtual worlds, starting with the apps that fall into the “news” category in the App Store or Play Store (are they really news?). This phenomenon is amplified by misinformation and fake news: if we look for a news app in the iOS or Android store, will we really find news apps? In this context, since it is the publisher that chooses the category in which an app is placed, does the app ecosystem really reflect a world of news?

Research Questions

What does the news app environment look like?

How is news addressed through apps in different usage scenarios (category ‘News and Magazines’ and keyword search)?

What does the news app environment look like in different usage scenarios?

What news content is accessible to users in this news app environment?

Research Design Visual Protocol

To answer these research questions, three usage scenarios were considered, taking into account specific data and tools to extract it.

Usage scenario 1: user searches by category in Google Play Store (‘News and Magazines’)

Data collected (app unique identifiers) from Google Play Store with Link Klipper

Details extracted with DMI tool ‘Google Play Similar Apps’

Usage scenario 2: user searches with keyword (“notícias”, news in Portuguese) in Google Play Store

Data collected using a single keyword in Portuguese, “notícias”. App unique identifiers collected from Google Play Store with Link Klipper. Details extracted with the DMI tool ‘Google Play Similar Apps’.
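As an illustration of the collection step in the two scenarios above, the sketch below shows one way to reduce a list of exported links to unique app identifiers. It assumes that app pages follow the `play.google.com/store/apps/details?id=...` pattern; the function name `extract_app_ids` and the sample links are hypothetical and are not part of the tools used in the sprint.

```python
from urllib.parse import urlparse, parse_qs


def extract_app_ids(links):
    """Return the unique Google Play app identifiers found in a list of links.

    Only links that point to an app details page and carry an `id`
    query parameter are kept; order of first appearance is preserved.
    """
    seen = []
    for link in links:
        parsed = urlparse(link)
        if parsed.netloc != "play.google.com":
            continue
        if not parsed.path.startswith("/store/apps/details"):
            continue
        ids = parse_qs(parsed.query).get("id", [])
        if ids and ids[0] not in seen:
            seen.append(ids[0])
    return seen


# Hypothetical links, shaped like the export of a link-extraction extension.
sample_links = [
    "https://play.google.com/store/apps/details?id=com.example.news",
    "https://play.google.com/store/apps/details?id=com.example.news&hl=pt",
    "https://play.google.com/store/apps/category/NEWS_AND_MAGAZINES",
    "https://play.google.com/store/apps/details?id=org.example.radio",
]

app_ids = extract_app_ids(sample_links)
```

The resulting list of identifiers is the kind of input that a "similar apps" lookup would then take as its starting points.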

An experiment: personalization and Google Play Store searches

Data collected by group members using the category ‘News and Magazines’ (two different scenarios), extracted with Link Gopher and analysed in NodeXL.

Methodology

Run a trial with Gephi and NodeXL (an experiment focused on personalization)

Quantitative/Qualitative Analysis of the networks generated by Gephi

Findings

In the central layer, the biggest nodes are related to journalism or news. On a second level, however, the next cluster of related apps can hardly be identified as journalism. The apps at this level are constantly updated (such as TED or 9GAG), so the platform may understand this structure as a form of news update. We could not reach a clearer conclusion.

Generic Observations about Categorical Data

The “Search by Category” network displays the output of Google Play Store’s recommendation algorithm. In the center we find big transnational news organizations, tech companies and news aggregators that bridge into more local markets. In other words, the central cluster represents the generally more popular news outlets, with recognizable names such as BBC, New York Times, News Republic (India) and Times Internet Limited, whose apps are downloaded regularly by users (the number of downloads also impacts the recommendations).

Most clusters in the immediate vicinity of the center correspond to country markets. For example, the dark grey area to the upper right is the German app market, the blue-green cluster at the very top is the Russian market, and the blue cluster on its left is the French market. Less apparently, some clusters also reflect topic-specific news. In the center left, for example, the brown cluster corresponds to financial news apps.

Interestingly, there are apps that specifically link local (news) app markets to the center. It is not clear why these apps occupy these bridging positions. For Germany, for example, the NTV news app has this function, although this news channel would generally be considered a mid-field player in the German journalism landscape.

In the network above, we take a closer look at the German market, which shows connections between apps that are not even related to journalism or news. This cluster, though, is very close to the node that represents Der Spiegel, one of the most prestigious weekly magazines in the country.

Though not systematically, there are also apps that seem to bridge different geographies while avoiding transnational hubs. Why this is the case, and what exactly it means, would require more elaborate research.
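One possible way to make the bridging positions described above operational is to list the nodes that have at least one edge into a cluster other than their own. The sketch below is only an illustration under stated assumptions: the edge list, the cluster assignment (for example, a modularity class) and the function name `bridging_nodes` are hypothetical, and this is not the procedure used in the sprint, where the observation was made visually.

```python
from collections import defaultdict

# Hypothetical "similar apps" edges (source app -> recommended app).
edges = [
    ("global_news", "global_agg"),
    ("global_agg", "global_news"),
    ("de_channel", "global_news"),
    ("de_channel", "de_weekly"),
    ("de_weekly", "de_channel"),
    ("fr_daily", "fr_radio"),
]

# Hypothetical cluster assignment, one cluster label per node.
cluster_of = {
    "global_news": "center",
    "global_agg": "center",
    "de_channel": "germany",
    "de_weekly": "germany",
    "fr_daily": "france",
    "fr_radio": "france",
}


def bridging_nodes(edges, cluster_of):
    """Map each node to the set of other clusters it shares an edge with."""
    foreign = defaultdict(set)
    for source, target in edges:
        if cluster_of[source] != cluster_of[target]:
            foreign[source].add(cluster_of[target])
            foreign[target].add(cluster_of[source])
    return dict(foreign)


bridges = bridging_nodes(edges, cluster_of)
```

In this toy data only `de_channel` and `global_news` take part in an edge that crosses clusters, which mirrors the idea of a single app linking a country market to the center.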

At the first level of analysis, it is also clear that most clusters are connected through technology: different technology giants connect to different news clusters, which are themselves divided along territorial lines.

Some keywords to keep in mind when considering the general category-based results in the Google Play Store are popularity, territory and paid advertisements used to boost visibility.

Generic Observations about Keywords

Although the largest network is composed of news apps from newspaper companies, especially American and European newspapers and channels (in English), the role of journalism as an information producer and distributor is called into question by the multiple actors present in that scenario.

Among these actors, religious institutions or groups are the most prominent: they are present in four clusters, namely orange (the most specific one), black, green and blue.

The results also highlight a strong presence of sports apps (both journalistic and football club apps), e.g. Uol Brasileirão 2018 (Brazilian football championship).

In the search results (in Portuguese), traditional journalistic areas did not appear as specific sectors, for example Politics, Health (although the map includes pages on detox food and a web diet app with suggestions for patients) and Cultural themes (cinema, literature, visual arts, etc.).

Network Analysis – News Category – Green Area 

The largest Green Area nodes are “Jornal de Portugal” (PT) and “Plantão Brasil Notícias” (BR).

They are connected to each other and feed countless smaller apps. The app “Jornal de Portugal” is connected to representative Portuguese news outlets, such as “Informações ao Minuto”, “Rádio Positiva Portugal”, “Sapo Jornais”, “MSBN”, among others.

“Jornal de Portugal” (PT) also connects with apps from Brazil and Mozambique, such as “Plantão Brasil de Notícias” (Brazil) and “Angola Notícias” (Mozambique). The app has great penetration among small radio stations in the interior of Brazil, such as “Radio Taiobeiras Livre”, in Minas Gerais, BR. It also has connections with religious apps such as “Harpa Cristã Assembléia de Deus” and “Igreja Aliança”, in Jacarepaguá, BR. Commercial connections also attract attention, for example with “Cartão Continente”, “Santander”, “Banco do Brasil” and “Unimed BH”.

The app “Plantão Brasil de Notícias” (BR) connects to the media apps “Jornal de Portugal”, “Jornais Portugueses”, “Notícias e Jornais de Portugal”, “Angola Notícias”, “Notícias Angola”, “Sapo Jornais”, “Sapo Desporto” and “InfoMoney”. Other minor ones are “TV1234”, “Financiamento Lojista” and “Caminhos da Fé”.

Following these two, the main Apps are:

(PT) – “InfoMoney”, “Jornais Portugueses”, “Notícias e Jornais de Portugal”, “Sapo Jornais”, “Informação ao Minuto – Notícias de Portugal”, “Sapo Desporto”, “Mais Futebol”.

(BR) – “G1 Practice”, “Globo News Play”, “Record TV”.

(Africa) – “Angola Notícias”, “IMW Boiuna”.

Many prominent nodes (21) have no name information. One example is a node connected to “Plantão Brasil de Notícias”, to which 35 national and local media apps are connected, such as “Rádio Online Portugal”, “Radio Brasil Atual”, “Rádio Online FM” and “Web Radio”. In this case, checking the “label” in the “Data Laboratory” makes it possible to retrieve the id of the app and then identify it.

When checking the id of some of these “anonymous” clusters, we find big news portals, such as the Portuguese “Notícias de Portugal” and “RTP Notícias”; the Brazilian “G1 O Portal de Notícias da Globo” and “O Globo Notícias”; news app aggregators such as “Notícias do Brasil”; and others from local developer initiatives, such as “News Brasil Blue Blood” or “Portal de Notícias.Net”.
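The manual check of unnamed nodes described above could be prepared with a small script. The following is a minimal sketch, assuming the node table has been exported as CSV with `Id` and `Label` columns and that the id is the app identifier used in Play Store URLs; the sample rows and the function name `unlabeled_app_urls` are hypothetical.

```python
import csv
import io

# Hypothetical nodes table, shaped like a CSV export with "Id" and "Label" columns.
nodes_csv = """Id,Label
com.example.news,Example News
com.example.portal,
org.example.radio,Example Radio
br.example.futebol,
"""

PLAY_URL = "https://play.google.com/store/apps/details?id={app_id}"


def unlabeled_app_urls(csv_text):
    """Return (app id, store URL) pairs for nodes whose label is empty."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["Id"], PLAY_URL.format(app_id=row["Id"]))
        for row in reader
        if not (row["Label"] or "").strip()
    ]


to_check = unlabeled_app_urls(nodes_csv)
```

Each URL in `to_check` can then be opened by hand to read the app name and developer, which is the identification step the text describes.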

We noticed that the apps of Brazilian TV stations are connected to each other: “TV Record” also links to “Band” or “Globo News”. The apps of the major Brazilian print newspapers, such as “O Globo”, “Folha de São Paulo”, “O Estado de São Paulo” or “Jornal do Brasil”, are not as prominent. Radio apps reproduce content produced by print media or TV.

It is interesting to notice that religious apps are closely connected to the biggest node, associated with “TV Record”, a Brazilian broadcast network with strong religious ties: the Universal Church of the Kingdom of God owns the TV company.

In the news cluster there are also apps for jobs (human resources), classifieds, religious services (“Bíblia para Mulher MP3”, “Católico Orante”, “Bíblia Sagrada”, and many others), lottery, financial information, government, dictionaries and recipes.

Many of the connections between this cluster and the others run through the biggest apps. For example, “UOL News App in Real Time” (blue area) relates to apps such as “Plantão Brasil de Notícias”, “Notícias e Jornais de Portugal” and “Folha de São Paulo Impressa”; to “Expresso” (gray area); to CNN, NBC and Reuters (pink); and to “Flamengo TV e Notícias” and other unidentified apps (orange).

Purple Area 

At the core of the Purple Area (cluster) we can identify reference news outlets such as “NYTimes”, “BBC”, “The Guardian”, “Reuters” and “CNN”, as well as aggregators (“Flipboard” and “Bundle”, for instance). There is an important node, “Conservative News”, that bridges the purple area and the green area, the second most relevant in this network.

It is also relevant to mention that “Microsoft Nieuws” (Dutch) is a big node that connects to the Pink Area, which is more focused on “service” apps such as “Gmail”, “Microsoft PowerPoint” and even “WhatsApp”. It is very interesting that “WhatsApp” (frequently used to spread news) lies completely in the Pink Area, outside the main news area. Around “WhatsApp” we find “Microsoft PowerPoint”, “Microsoft OneNote” and “Microsoft Word”.

We can also see “Twitter” as a relevant node between the Purple Area and the Pink Area. It is to be expected that “Twitter” has a stronger relationship with news.

There is a relevant cluster in the Purple Area related to science news, including “NASA”, “Khan Academy” and some scientific journals.

Blue area

“UOL – Notícias em Tempo Real” is a big node, distant from the rest of the blue network and connected to apps that are not related to news, such as “A Bíblia Sagrada – NVI”, “Ofertaço Brasil”, “Yahoo Mail Blijf georganiseerd” and “Caixa”. On the other hand, there are also some apps that are indeed specific to news and developed by news companies: “Gazeta do Povo Mobile”, “Diário do Nordeste”, “Estadão Mobile” and “Eleven”.

The most relevant (i.e. biggest) nodes include: “Placar UOL – Brasileirão 2018”, “Globoesporte.com”, “Notícias do Vasco da Gama”, “Brasileirão Pro 2019 – Série A e B”, “SouSporting – Notícias do Sporting”, “Segunda Liga”, “365Scores – Live Uitslagen”, “BeSoccer – Soccer Live Score”, “FotMob – Live Football Scores”, “Goal.com”, “Goal Live Scores”, “All Football – Latest News & Live Scores”, “UEFA Champions League”, “MSN Sport”, “NBA app” and “Benfica Official App”. Another interesting fact is the recurrence, in this blue network, of highly relevant nodes that are not identified by a name or label. In these cases, the “Data Laboratory” only gives us the id of the app; with this information we can follow the link and manually check the data about the app.

Some of the nameless apps are also very strongly related to soccer or sports. Examples: “90min – Live Soccer News App”, “FotMob – Live Soccer Scores”, “LiveSoccer live scores: FIFA World Cup 2018”, “Fogão Notícias do Botafogo”, “Vascão Notícias do Vasco”, “Bahêa Notícias do Bahia” (these last three apps were developed by the same company – Brafoot).

Overall, it is quite evident that sports, and more specifically soccer teams and soccer championships, are the main theme of this part of the network.

Orange cluster

Observing isolated clusters: a fairly homogeneous cluster (orange) is formed by Adventist church actors. No other nodes point to them, but they point to three pages: Escuela Sabatica 24/7, Hinário Adventista and Harpa Cristã. The cluster represents 7.25% of the total results, which is significant compared to other clusters.
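The share of a cluster in the total results, such as the figure reported above, can be derived from a per-node cluster label. The sketch below assumes the share is counted over nodes. The node counts are hypothetical and were chosen only so that the orange share reproduces 7.25%; they are not the sizes of the actual network.

```python
from collections import Counter

# Hypothetical cluster labels, one per node. The counts are illustrative only
# and are chosen so that the orange share equals 7.25%.
node_clusters = ["orange"] * 29 + ["green"] * 150 + ["purple"] * 120 + ["blue"] * 101


def cluster_shares(labels):
    """Return each cluster's share of all nodes, as a percentage."""
    counts = Counter(labels)
    total = len(labels)
    return {cluster: 100 * n / total for cluster, n in counts.items()}


shares = cluster_shares(node_clusters)
```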

Black cluster

In contrast to the orange cluster, the black one is composed of nodes that address quite different topics. Although there are significant nodes for newspaper sites and portals (Folha de S.Paulo, Extra Notícias, Expresso, R7 and TV Portugal), the main content in the cluster is produced by non-journalistic channels. Among these, there are prominent nodes about cooking, religious institutions or movements, and services (such as complementary courses, cell phone top-ups, promotional airline tickets, diet guidelines, pharmacies, dictionaries, football team sites, etc.).

The exception is the node Jovem Net Official, which is linked to nodes that deal with different subjects. One of them is the app “Cabo Daciolo Sounds”, which collects the best speeches of a candidate for the presidency of Brazil in 2018. It highlights the use of humor, in this case through the circulation of memes.

Price and rating

Two variables related to user experience (price and rating) were collected for the Keyword sample and processed with Raw Graphs (rawgraphs.io). The aim was to understand other characteristics of this scenario by visualizing the number of free and paid apps. We found that the majority are free applications with high ratings (4 and 5). There are some paid applications with high ratings (black points), although they are a minority. We also found a relevant number of applications without price or rating information in our sample (yellow points in the graph below). Unfortunately, it is not clear why no information is available about them.

In the second graph (also created with Raw Graphs) we can see how price and rating, two fundamental variables from the users’ point of view, are connected. The darker region represents a greater concentration of applications. In the main area (on the left) there are many apps around a rating of 4.6 and a price of 3 dollars. The secondary zone (on the right) is composed of the most expensive apps (more than 7 dollars), with a rating of approximately 4.4.
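The grouping behind the first graph (free or paid, high or lower rating, or no information) can be sketched as follows. This is an illustration only: the records, the field names `price` and `rating`, and the threshold of 4.0 for a "high" rating are assumptions, and the sprint itself relied on Raw Graphs, not on code.

```python
from collections import Counter

# Hypothetical app records; None marks a missing price or rating.
apps = [
    {"id": "a", "price": 0.0, "rating": 4.6},
    {"id": "b", "price": 0.0, "rating": 3.2},
    {"id": "c", "price": 2.99, "rating": 4.5},
    {"id": "d", "price": None, "rating": None},
    {"id": "e", "price": 7.49, "rating": 4.4},
]


def classify(app, high_rating=4.0):
    """Label an app by payment model and by whether its rating is high."""
    if app["price"] is None or app["rating"] is None:
        return "no information"
    payment = "free" if app["price"] == 0 else "paid"
    level = "high rating" if app["rating"] >= high_rating else "lower rating"
    return f"{payment}, {level}"


# Count how many apps fall into each group.
summary = Counter(classify(app) for app in apps)
```

Keeping "no information" as its own group, instead of dropping those records, preserves the observation that a relevant number of apps lack price or rating data.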

Limitations
No information was available on the monetary aspects of the apps (i.e. advertising and SEO of the apps within the Google Play Store), hence we are not able to infer how much influence these have on node size or on the clusters themselves. Datasets were difficult to access, and we would need more user data to make more robust observations.

An experiment: Personalized results

Scenario 1:
Put simply, in our experimental approach six users used their personal laptops to extract links for the navigation path Google Play Store > Apps > Categories > News and Magazines. These users did not use any keyword, only the News and Magazines category within the Google Play Store. The users used different browsers.

Then, using NodeXL to visualize our data, we found that users who had not logged into their Google accounts did not get personalized results or recommendations. That is the first layer of the recommendation cycle. Based on the analysis, however, we are confident that this recommendation algorithm has many layers; the second layer seems to be more complex and needs further investigation, perhaps in the next sprint.
Based on this visualization, we can also say that the recommendations given to logged-in users are influenced by their geography. Harald’s data clearly shows the German preference.

Scenario 2:
In this visualization we can see 6 different clusters representing the 6 users, who were all logged in to their Google accounts. The grey nodes show apps that only one person in the treatment group (i.e. the people who had logged in to their Google account) got as a recommendation. In technical terms, grey signifies an in-degree of 1.
The size of a node represents its ranking, that is, how many users in our treatment group were recommended that app. In other words, it is the in-degree, whose maximum is 6 because 6 people took part in this experiment.
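The in-degree described above is simply the number of users who received a given app as a recommendation. A minimal sketch with hypothetical users and app ids (three users instead of six, for brevity) follows; the sprint computed and visualized this in NodeXL, so the code is only an illustration of the measure.

```python
from collections import Counter

# Hypothetical recommendations: user -> set of recommended app ids.
recommendations = {
    "user_1": {"app_a", "app_b", "app_c"},
    "user_2": {"app_a", "app_b"},
    "user_3": {"app_a", "app_d"},
}


def in_degrees(recommendations):
    """Count, for each app, how many users were recommended it."""
    counts = Counter()
    for apps in recommendations.values():
        counts.update(apps)  # each user contributes at most one edge per app
    return counts


degree = in_degrees(recommendations)

# Apps recommended to exactly one user correspond to the grey nodes.
grey_nodes = sorted(app for app, d in degree.items() if d == 1)
```

Because each user holds a set of apps, the in-degree of an app can never exceed the number of users, which matches the maximum of 6 in the experiment.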

A note: comparing the two scenarios, 19 recommendations were given to all 6 people in the second scenario, against 24 in the first. This suggests that logging in to a Google account narrows down the field of recommendations.
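The comparison above counts the apps recommended to every participant in each scenario. Assuming each user's recommendations are available as a set, that count is the size of the intersection of all sets. The data and the function name `shared_by_all` below are hypothetical and only illustrate the calculation.

```python
def shared_by_all(recommendations):
    """Return the apps that appear in every user's recommendation set."""
    sets = list(recommendations.values())
    return set.intersection(*sets) if sets else set()


# Hypothetical data for two scenarios with the same three users.
scenario_1 = {
    "user_1": {"app_a", "app_b", "app_c"},
    "user_2": {"app_a", "app_b", "app_c"},
    "user_3": {"app_a", "app_b", "app_d"},
}
scenario_2 = {
    "user_1": {"app_a", "app_e"},
    "user_2": {"app_a", "app_b"},
    "user_3": {"app_a", "app_f"},
}

common_1 = shared_by_all(scenario_1)
common_2 = shared_by_all(scenario_2)
```

A smaller shared set in one scenario indicates that the users' results diverge more there, which is the sense in which the recommendations field narrows.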

Another fact of note is that, in the second scenario, Ana Marta’s and Roli’s data was extracted from an Apple environment: both use Apple devices (laptops and mobiles), hence they had not used the Google app store previously on these devices. This difference in interfaces can and should be investigated with more users and multiple countries to gain further insight into the recommendation algorithm.

This key finding might seem obvious, but our experiments with treatment groups show that there are several layers to the recommendation algorithm, and that the first layer or step depends on whether a user is logged in or not. This experiment uses different tools to confirm a theory, but it also pushes us to explore other areas of the news app environment, such as the corporate structures behind apps and their impact on recommendations. How advertising affects recommendations is an area we are not sure we can explore, given the lack of open data.

Discussion&Conclusion

There is no clear difference between “news” and “information”. The apps categorized as news or journalism are not always news from a journalistic perspective; some carry institutional or even marketing content. In the apps ecosystem it is therefore more difficult to map news apps than, for example, in the print or TV ecosystem. As a consequence, the apps ecosystem is attractive to engagement strategies related to misinformation and fake news.

  • Our first dataset (search by Category) gave us a more “accurate” network of News and Journalism apps than the second one (search by Keyword). The Keyword network, on the other hand, is bigger and more complex, and in it the concept of “news” is tied to information in general. A keyword search may therefore be more exposed to “misinformation”.
  • Journalists are not the only players in the news apps environment.
  • In the apps environment, the concept of news (from a journalistic perspective) is very close to the common-sense idea of “information”.
  • Tech companies sit very close to the news publishers cluster, which may reflect the current hybrid relationship between news organizations and technology developers.
  • Different ways of searching produce different result scenarios, which suggests that the news app environment is highly dynamic and personalized, and therefore difficult to map.
  • The flow in the news app environment is highly ephemeral: locations and dynamics evolve rapidly.

References

Canavilhas, J., & Rodrigues, C. (orgs.) (2017). Jornalismo Móvel: Linguagem, géneros e modelos de negócio. Covilhã: Labcom. Available at http://labcom-ifp.ubi.pt/livro/289

Canavilhas, J., & Satuf, I. (orgs.) (2015). Jornalismo para Dispositivos Móveis: Produção, distribuição e consumo. Covilhã: Labcom. Available at http://www.labcom-ifp.ubi.pt/livro/137

Cantarero, T. N., González-Neira, A., & Valentini, E. (2017). Newspaper apps for tablets and smartphones in different media systems: A comparative analysis. Journalism. https://doi.org/10.1177/1464884917733589

Dieter, M., Gerlitz, C., Helmond, A., Tkacz, N., van der Vlist, F., & Weltevrede, E. (2018). Store, interface, package, connection: Methods and propositions for multi-situated app studies. CRC Media of Cooperation Working Paper Series No. 4, 30 August. Collaborative Research Center 1187 Media of Cooperation, University of Siegen. Available at http://bit.ly/wps1187-4

Light, B., Burgess, J., & Duguay, S. (2018). The walkthrough method: An approach to the study of apps. New Media & Society, 20(3), 881–900. https://doi.org/10.1177/1461444816675438

Pearce, W., et al. (2018). Visual cross-platform analysis: Digital methods to research social media images. Information, Communication & Society. https://doi.org/10.1080/1369118X.2018.1486871

Rogers, R. (2013). Digital Methods. Cambridge, MA: MIT Press.