Practical Labs

Universidade Nova de Lisboa | NOVA FCSH | iNOVA Media Lab
Digital Media Winter Institute | SMART Data Sprint 2021
The current state of platformisation | 01 – 05 February 2021 | 9:00 – 18:00 (Lisbon time)
#SMARTdatasprint | Research Blog | Facebook Group: SMART Data Sprint | @iNOVAmedialab

SLACK: SMARTDataSprint 2021

Practical Labs 2021 (timetable)

MONDAY (AFTERNOON), 1 Feb 2021

BEGINNERS TO INTERMEDIATE
Zoom Link

13h00 – 13h50
Querying digital platforms & extracting data
Janna Joceli Omena

13h50 – 15h00
Using APIs with Facepager
Jakob Jünger

10m break

15h10 – 16h40
Content analysis on Instagram
Ana Marta M. Flores

16h40 – 18h00
Basic tricks for working with data
Fábio Gouveia

INTERMEDIATE TO ADVANCED
Zoom Link

13h00 – 14h00
Opening up the black-box of mobile apps’ traffic
Jason Chao

14h00 – 15h30
Studying memes through platform data spreadsheets
Elena Pilipets

10m break

15h40 – 17h10
Visualising image clusters with Gephi
Density Design Lab Team

17h10 – 18h00
Querying digital platforms & extracting data
Janna Joceli Omena

 

TUESDAY (MORNING), 2 Feb 2021

BEGINNERS TO INTERMEDIATE
Zoom Link

09h00 – 10h00
Webscraping with Facepager
Jakob Jünger

10h00 – 10h45
Reading digital networks
Janna Joceli Omena

10h45 – 11h45
Gephi for Beginners
Leonardo Melgaço

INTERMEDIATE TO ADVANCED
Zoom Link

09h00 – 10h20
Using AI to enrich image data
Jason Chao

10h20 – 11h40
Analysing Images by content similarity with computer vision (CLARIFAI)
Density Design Lab Team

 





TUESDAY (AFTERNOON), 2 Feb 2021

BEGINNERS TO INTERMEDIATE
Zoom Link

15h30 – 17h00
Visualising image clusters with Gephi
Density Design Lab Team

17h00 – 18h00
Mapping gender issue-networks with Wikipedia: from co-word occurrences to topics of interest
Leonardo Melgaço
Gracila Vilaça

INTERMEDIATE TO ADVANCED
Zoom Link

15h30 – 18h00
Mapping Twitter social media networks with NodeXL Pro
Marc Smith

INTERMEDIATE TO ADVANCED
Zoom Link

15h30 – 16h30
Reading digital networks
Janna Joceli Omena

16h30 – 18h00
Advanced tricks for working with data
Fábio Gouveia

 

Practical Labs 2021 (info)

1. Practical Lab 

Querying digital platforms & extracting data

#slides/folder URL:

https://drive.google.com/file/d/1nzOKYJPnRoaof8Sbl4NwCt3DxnJglc6u/view?usp=sharing

2. Facilitators 

Janna Joceli Omena 

3. Short Description 

This workshop addresses key aspects to consider when querying digital platforms and extracting data. From the formulation of research questions as queries (query design) to the use of data extraction tools (software practices), we will reflect on situations in which both the software and the researcher’s decisions intervene in, re-adjust and re-shape representations of online activity.

4. Requirements 

We will work with YouTube Data Tools (Rieder, 2015) and Google Spreadsheets. 

 

1. Practical Lab 

Content analysis on Instagram

#slides URL/pdf:

http://bit.ly/instagram_AMF

2. Facilitators 

Ana Marta M. Flores

3. Short Description 

What kind of questions can one answer with access to Instagram data? We are going to work with metrics (like count, comment count, etc.) and content (image URLs, time/date, captions) from a specific Instagram dataset. To do so, we will start by organizing and cleaning the dataset using some shortcuts and cell formulas in Google Spreadsheets. Then, we will perform a preliminary content analysis by identifying and combining: (1) visual patterns/categories in the dataset; (2) high and low engagement posts; and (3) the most frequently used emojis. Finally, we will develop graphs from this same dataset in Raw Graphs. Visualizations such as a treemap (hierarchy), streamgraph (time series) or alluvial diagram (multi-categorical) can be built to better present the findings.
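For participants who prefer a script to cell formulas, the engagement ranking and emoji count described above can be sketched in Python. This is only an illustration, not the workflow used in the lab: the column names and sample rows are hypothetical, engagement is assumed to be likes plus comments, and emojis are approximated as characters in the Unicode category "So" (Symbol, other), which also matches some non-emoji symbols.

```python
import unicodedata
from collections import Counter

# Hypothetical rows; a real Instagram export will use its own column names.
posts = [
    {"caption": "Sunset in Lisbon 🌅🌅", "likes": 120, "comments": 14},
    {"caption": "Monday mood 😂", "likes": 15, "comments": 2},
    {"caption": "New mural downtown 🌅 😂", "likes": 64, "comments": 9},
]


def rank_by_engagement(rows):
    """Sort posts from highest to lowest engagement (likes + comments)."""
    return sorted(rows, key=lambda r: r["likes"] + r["comments"], reverse=True)


def count_emojis(rows):
    """Count caption characters in Unicode category 'So' (a rough emoji proxy)."""
    counts = Counter()
    for row in rows:
        counts.update(
            ch for ch in row["caption"] if unicodedata.category(ch) == "So"
        )
    return counts


ranked = rank_by_engagement(posts)
print("highest engagement:", ranked[0]["caption"])
print("lowest engagement:", ranked[-1]["caption"])
print("most used emojis:", count_emojis(posts).most_common(3))
```

The first and last items of the ranked list give candidate high and low engagement posts to inspect by hand.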

4. Requirements 

Participants must have a basic knowledge of Google Spreadsheets. Web-based and open tools such as Text Analysis and Raw Graphs will be used in this Practical Lab. 

 

1. Practical Lab 

Studying memes through platform data spreadsheets 

  #slides/folder URL:

http://bit.ly/memes_data2021 

2. Facilitators 

Elena Pilipets

3. Short Description 

This practical lab focuses on the possibilities of exploring memes and other visual vernaculars (e.g., screenshots, GIFs) through platform data spreadsheets. We will first practice how to filter and organize images according to their digital attributes (e.g., time of posting, engagement metrics, hashtags, image captions, etc.) in Google Spreadsheets. In the second step, we will learn how to download the images using a list of image URLs and extensions such as DownThemAll, and then analyze the result based on visual color patterns and temporality. To this end, we will use ImageSorter. In addition, this practical lab offers a discussion of variously designed data visualizations and their narrative possibilities with regard to: (1) the contextual situatedness of memes provided through co-tag relations; (2) the restrictions of looking only at the content with the most exposure; and (3) the patterns of image adaptation over time as an indicator of shifts in relations of relevance.
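The co-tag relations mentioned above can be illustrated with a short Python sketch that counts how often two hashtags appear on the same post. The sample cells are hypothetical, and the sketch assumes that each spreadsheet cell holds space-separated hashtags; the lab itself works in Google Spreadsheets rather than code.

```python
from collections import Counter
from itertools import combinations

# Hypothetical hashtag cells, one per post, as they might appear in a spreadsheet.
tag_cells = ["#meme #cat #monday", "#cat #meme", "#monday #coffee"]


def co_tag_counts(cells):
    """Count unordered hashtag pairs that occur together on the same post."""
    pairs = Counter()
    for cell in cells:
        # Lowercase and de-duplicate so '#Cat' and '#cat' count as one tag.
        tags = sorted({t.lower() for t in cell.split() if t.startswith("#")})
        pairs.update(combinations(tags, 2))
    return pairs


for (tag_a, tag_b), n in co_tag_counts(tag_cells).most_common():
    print(tag_a, tag_b, n)
```

Each counted pair can be read as a weighted edge between two hashtags.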

4. Requirements 

Please create a Google account to use Google Drive and install the following extensions/visualization tools: DownThemAll for Google Chrome; ImageSorter for Windows or ImageSorter for Mac. During the workshop, we will also use text analysis tools such as TagCrowd and TextAnalysis. A paper by Sabine Niederer and Gabriele Colombo presents a digital methods approach to image research.

 

1. Practical Lab 

Using APIs with Facepager

  #slides/folder URL:

Facepager Slides
Facepager Cheat sheet
Facepager Usergroup

Please use the hashtag #SMARTdatasprint when asking questions in the usergroup.

2. Facilitators 

Jakob Jünger

3. Short Description 

In this session you will learn how to use APIs from online platforms such as Facebook, YouTube, Twitter, or Wikipedia for automated data collection. After a short introduction to the basics of application programming interfaces, we will collect comments for text analysis and links for network analysis. The practical lab introduces you to Facepager, a versatile tool for automated data collection.
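To give a feel for what an API request and its response look like outside Facepager, here is a minimal Python sketch based on the MediaWiki Action API of Wikipedia. It builds a query URL for the links on a page and turns a response into network edges. The article title and the abbreviated sample response are made up for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_links_query(title):
    """Build a MediaWiki Action API URL asking for the links on one page."""
    params = {
        "action": "query",
        "prop": "links",
        "titles": title,
        "pllimit": "max",
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def edges_from_response(payload):
    """Turn a parsed JSON response into (source page, linked page) edges."""
    edges = []
    for page in payload["query"]["pages"].values():
        for link in page.get("links", []):
            edges.append((page["title"], link["title"]))
    return edges


# Hand-written, abbreviated response in the shape the API returns (illustrative values).
sample_response = {
    "query": {
        "pages": {
            "1001": {
                "pageid": 1001,
                "ns": 0,
                "title": "Example article",
                "links": [
                    {"ns": 0, "title": "First linked page"},
                    {"ns": 0, "title": "Second linked page"},
                ],
            }
        }
    }
}

print(build_links_query("Example article"))
print(edges_from_response(sample_response))
```

The resulting edge list is the kind of data that can feed the network analysis mentioned in the description.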

4. Requirements 

Install and run the latest version of Facepager from https://github.com/strohne/Facepager. You will also need Excel, Numbers, R, Python or similar software for reading spreadsheet data.

 

1. Practical Lab 

Webscraping with Facepager

#slides/folder URL:

Facepager Slides
Facepager Cheat sheet
Facepager Usergroup

Please use the hashtag #SMARTdatasprint when asking questions in the usergroup.

2. Facilitators 

Jakob Jünger

3. Short Description 

Webscraping refers to the automated extraction of data from webpages. It can be used whenever no API is available. After a short introduction to different techniques and hurdles we will extract data from news pages for text analysis and URLs for network analysis. The practical lab introduces you to Facepager, a versatile tool for automated data collection.
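As a rough illustration of what extracting URLs from a webpage involves, the following Python sketch collects the link targets from a small HTML snippet using only the standard library. It is not how Facepager works internally; the HTML and the base URL are hypothetical, and no page is downloaded.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:  # skip anchors without a link target
                self.links.append(urljoin(self.base_url, href))


# Hypothetical fragment of a news page.
sample_html = (
    '<html><body><a href="/politics/story-1">Story</a> '
    '<a href="https://example.org/about">About</a><a>no link</a></body></html>'
)

parser = LinkExtractor("https://news.example.com/")
parser.feed(sample_html)
print(parser.links)
```

Relative links are resolved against the page address so that every collected URL can serve as a node in a link network.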

4. Requirements 

Install and run the latest version of Facepager from https://github.com/strohne/Facepager. You will also need Excel, Numbers, R, Python or similar software for reading spreadsheet data.

 

1. Practical Lab 

Basic tricks for working with data

#slides/folder URL:

Slides

2. Facilitators 

Fábio Gouveia

3. Short Description 

Dealing with data in text files can be challenging for beginners. Keeping track of where files come from, avoiding issues when importing them and joining them may require knowledge that is not part of everyday office work. The main goal of this lab is to give some basic tips and tricks for handling files for further data analysis and visualization. This practical lab will focus on file naming concerns, file preparation and data consolidation. A basic approach to shell (command prompt) usage and text code page issues will also be part of this practical lab.
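The code page and consolidation issues mentioned above can be sketched in Python as follows. This is an illustrative assumption about one possible approach, not the procedure taught in the lab: it tries UTF-8 first and falls back to Windows-1252, then merges several CSV files while recording the file each row came from. The file names and contents are hypothetical and are held in memory so that the example runs on its own.

```python
import csv
import io


def decode_with_fallback(raw, encodings=("utf-8", "cp1252")):
    """Try each candidate encoding in turn and return the first clean decode."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings worked")


def consolidate(files):
    """Merge CSV files (name -> raw bytes) and keep track of each row's origin."""
    merged = []
    for name, raw in files.items():
        reader = csv.DictReader(io.StringIO(decode_with_fallback(raw)))
        for row in reader:
            row["source_file"] = name
            merged.append(row)
    return merged


# Two hypothetical exports saved with different code pages.
files = {
    "posts_2021-02-01.csv": "id,text\n1,São Paulo\n".encode("utf-8"),
    "posts_2021-02-02.csv": "id,text\n2,Fábio\n".encode("cp1252"),
}

merged = consolidate(files)
for row in merged:
    print(row)
```

The fallback order matters: Windows-1252 accepts almost any byte sequence, so it should be tried only after stricter encodings have failed.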

4. Requirements 

Participants need to have their own computer. We will mainly work using a simple text editor like Notepad (Sublime Text 3 or Notepad++ is best) and a spreadsheet like Google Sheets or Excel. We will also briefly explore shell commands (command prompt) to perform some tasks, so administrator privileges may be necessary.

 

1. Practical Lab 

Advanced tricks for working with data

#slides/folder URL:

Slides

2. Facilitators 

Fábio Gouveia

3. Short Description 

Dealing with large numbers of text files is part of any data scientist’s regular activity. Challenges arise when you find that you can no longer rely on simple text editors or spreadsheets, even in their 64-bit versions. Structured file formats, although powerful, also start to confront the now intermediate data scientist. The main goal of this lab is to give some intermediate tips and tricks for handling files for further data analysis and visualization. This practical lab will focus on understanding some more structured file formats and performing operations to prepare them for further use. Some shell (command prompt) usage, simple software execution and an introduction to the Open Refine tool will also be part of this practical lab.
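One way to handle files that are too large for an editor or spreadsheet is to read them record by record instead of loading them whole. The Python sketch below assumes a JSON Lines file (one JSON object per line), which is only one of many structured formats and is not necessarily the one used in the lab. The field names are hypothetical, and an in-memory buffer stands in for a large file.

```python
import io
import json


def iter_records(handle, wanted=("id", "text")):
    """Yield selected fields from a JSON Lines stream, one record at a time."""
    for line in handle:
        line = line.strip()
        if not line:  # tolerate blank lines
            continue
        record = json.loads(line)
        yield {key: record.get(key) for key in wanted}


# In-memory stand-in for a large file opened with open(path, encoding="utf-8").
raw = io.StringIO(
    '{"id": 1, "text": "first post", "lang": "pt"}\n'
    "\n"
    '{"id": 2, "text": "second post"}\n'
)

records = list(iter_records(raw))
print(records)
```

Because the function is a generator, memory use stays flat however long the file is, as long as the records are consumed one by one rather than collected in a list.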

4. Requirements 

Participants need to have their own computer. We will mainly work using a simple text editor like Notepad (Sublime Text 3 or Notepad++ is best), spreadsheets like Google Sheets or Excel, and Open Refine. We will also explore shell commands (command prompt) to perform some tasks, so administrator privileges may be necessary. Familiarity with the content of the beginner practical labs is desirable.

 

1. Practical Lab 

Using AI to enrich image data

#slides/folder URL:

https://github.com/jason-chao/workshops/blob/main/2021/Feb-SMART2021/enrich_image_data.md 

2. Facilitators 

Jason Chao

3. Short Description 

This practical lab will introduce the affordances of Google Vision API in social research and the tool Memespector-GUI. Google Vision API is a powerful tool widely used in business to derive intelligence from images. The participants will learn how to: (1) make sense of the semi-structured output of the API; (2) repurpose the API to analyse social media images, as an example; and (3) apply secret tricks to keep using the API for free (or at least paying as little as possible).
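To illustrate what making sense of semi-structured output can mean in practice, the Python sketch below flattens label annotations from a Google Vision API style JSON response into spreadsheet-like rows. The sample values, the image identifiers and the score threshold are made up, and the files written by Memespector-GUI may be organised differently.

```python
def flatten_labels(image_ids, api_output, min_score=0.5):
    """Turn nested label annotations into (image, label, score) rows."""
    rows = []
    # Responses are assumed to come back in the same order as the requested images.
    for image_id, response in zip(image_ids, api_output["responses"]):
        for label in response.get("labelAnnotations", []):
            if label["score"] >= min_score:
                rows.append((image_id, label["description"], label["score"]))
    return rows


# Hand-written sample in the shape of a label detection response (illustrative values).
sample_output = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "Cat", "score": 0.98},
                {"description": "Furniture", "score": 0.41},
            ]
        },
        {},  # an image for which no labels were returned
    ]
}

rows = flatten_labels(["img1.jpg", "img2.jpg"], sample_output)
print(rows)
```

A flat table of this kind can be opened in a spreadsheet or joined with the social media metadata of each image.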

4. Requirements 

Participants need to bring their own computer and have at least one Google/Gmail account and a payment card.

 

1. Practical Lab 

Mapping gender issue-networks with Wikipedia: from co-word occurrences to topics of interest

  #slides/folder URL:

http://bit.ly/WomenInTech_WORDij

2. Facilitators 

Leonardo Melgaço
Gracila Vilaça

3. Short Description 

In this practical lab, we will use WORDij, a system based on the linkage strength between words (DANOWSKI, 2013), to process text files from Wikipedia articles. We will focus on the network output from WORDij based on co-word occurrences. By critically analysing word associations within a network in Gephi (BASTIAN; HEYMANN; JACOMY, 2019), we will be able to map topics of interest within an issue-network (ROGERS, 2018). Empirically, we investigate gender gap issues related to the ambivalent movement “women in technology” (CHAU, 2017).
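The general idea of a co-word occurrence network can be sketched in a few lines of Python: two words are linked whenever they appear close to each other, and the link weight is the number of times this happens. The window size, the tokenisation and the sample sentence are assumptions made for illustration and do not reproduce the procedure or parameters of WORDij.

```python
import re
from collections import Counter


def co_word_pairs(text, window=3):
    """Count unordered word pairs that occur within `window` consecutive words."""
    words = re.findall(r"[a-z]+", text.lower())
    pairs = Counter()
    for i, word in enumerate(words):
        # Pair each word with the (window - 1) words that follow it.
        for other in words[i + 1 : i + window]:
            if other != word:
                pairs[tuple(sorted((word, other)))] += 1
    return pairs


sample = "Women in technology face a gender gap. Women in technology lead."
pairs = co_word_pairs(sample)
for (word_a, word_b), weight in pairs.most_common(3):
    print(word_a, word_b, weight)
```

Each pair and its count correspond to a weighted edge that could be loaded into Gephi; in practice, frequent function words such as "in" or "a" would usually be filtered out first.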

4. Requirements 

Please make sure WORDij (<https://www.wordij.net/index.html>) and Gephi (<https://gephi.org>) are installed and running beforehand. We recommend that participants download the JDK (“Java Development Kit”). An optical mouse facilitates network navigation in Gephi.

 

1. Practical Lab 

Opening up the black-box of mobile apps’ traffic

#slides/folder URL:

https://github.com/jason-chao/workshops/blob/main/2021/Feb-SMART2021/blackbox_of_mobile_apps_traffic.md 

2. Facilitators 

Jason Chao

3. Short Description 

Unbeknown to their users, many mobile applications (apps) send out questionably large amounts of data to suspicious destinations, which calls the purposes of those apps into question. Inspecting the network traffic of digital devices used to require a lab set-up or some technical knowledge. This practical lab will introduce AppTraffic – a newly developed tool aimed at empowering researchers from different backgrounds to easily decrypt and study the data travelling in and out of mobile apps.

4. Requirements 

Participants need to bring their mobile device and their own computer. For the mobile device, an Apple iOS device (iPhone or iPad) is recommended.

 

1. Practical Lab 

Gephi for beginners

#slides/folder URL:

http://bit.ly/GephiForBeginners

2. Facilitators 

Leonardo Melgaço

3. Short Description 

In this practical lab we will learn how to create and, most importantly, how to unfold a complex network in Gephi (https://gephi.org). We are going to explore the software’s interface and navigate through a network while addressing three central aspects: network spatialization (ForceAtlas2 layout algorithm), network aesthetics (node colors and sizes) and statistical modularity (cluster analysis).

4. Requirements 

Participants need to install Gephi (https://gephi.org) and make sure it’s running. Tip: download the JDK (“Java Development Kit”). An optical mouse facilitates network navigation in Gephi.

 

1. Practical Lab 

Mapping Twitter social media networks with NodeXL Pro

#slides/folder URL:

TBA

2. Facilitators 

Marc Smith

3. Short Description 

In this practical lab, we will use NodeXL Pro (https://nodexlgraphgallery.org) to collect, analyze, visualize and report on social media networks from Twitter. A range of topics, hashtags, URLs, and usernames can be mapped. The variety of social media network structures will be reviewed. We will learn to interpret and narrate these networks to others. The practical lab will include the automation features in NodeXL that ensure the creation of readable and complete network and content analysis and visualization. We will explore the many stories and insights that can be extracted and presented through social media network analysis:

  • who are the leaders or most influential contributors,

  • how are sub-groups or internal divisions formed in the population,

  • what topics, URLs, hashtags, and users are most commonly discussed?

Using these analytic elements, we will explore the ways these insights tell a story about populations, leaders, and topics over time. 
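As a simplified illustration of the first question in the list above, the Python sketch below ranks accounts by how often they are mentioned or replied to (their in-degree). This is just one basic proxy for influence and is not presented as the method NodeXL Pro uses; the usernames and edges are hypothetical.

```python
from collections import Counter

# Hypothetical (author, mentioned account) pairs extracted from tweets.
edges = [
    ("user_a", "user_m"),
    ("user_b", "user_m"),
    ("user_m", "user_j"),
    ("user_c", "user_j"),
    ("user_d", "user_m"),
]


def in_degree(edge_list):
    """Count how many times each account is the target of a mention."""
    return Counter(target for _source, target in edge_list)


top_accounts = in_degree(edges).most_common(2)
print(top_accounts)
```

Accounts that many others point to are candidates for closer inspection when narrating who leads a conversation.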

4. Requirements 

A computer or tablet is helpful. Windows and Office users will be provided with a courtesy license for the NodeXL Pro application. Mac and Tablet users will be provided with a courtesy month of access to the NodeXL Pro Cloud edition service.

Download NodeXL Pro from: https://nodexlgraphgallery.org/Pages/Default.aspx

These YouTube videos are useful introductions to NodeXL and mapping Twitter social media networks:

https://www.youtube.com/watch?v=kDiGl-2m868

https://youtu.be/mjAq8eA7uOM

This article from Pew Research is a good written introduction to social media network maps of Twitter:

http://www.pewinternet.org/2014/02/20/mapping-twitter-topic-networks-from-polarized-crowds-to-community-clusters/

 

1. Practical Lab 

Reading Digital Networks

#slides/folder URL:

bit.ly/reading-digital-networks-JJO

bit.ly/gephi-basic-guide_JJO

2. Facilitators 

Janna Joceli Omena 

3. Short Description 

This workshop introduces methodological strategies to read digital networks according to their visual affordances and technological grammar, also taking into account three different but related aspects: the triad grammatisation-cultures of use-software. We will ask and respond to the following questions: What to look at when reading networks? What do node positions and connections mean? How to visually interpret networks?

4. Requirements 

We will work with Gephi (Bastian, Heymann & Jacomy, 2019).

 

1. Practical Lab 

Analysing Images by content similarity with computer vision (CLARIFAI)

#slides/folder URL:

https://drive.google.com/drive/folders/1Ak7vKGB831073OIs_V_nBZvUSA1Ox2by?usp=sharing

2. Facilitators 

Antonella Autuori, Matteo Bettini, Andrea Elena Febres Medina

3. Short Description 

This approach helps to cluster and visualize a series of images according to how their content is classified by machine learning algorithms. It can be used to define and measure thematic visual clusters within the series. It is similar to a co-hashtag analysis, but done with visual content: for each image, tags are created with the assistance of computer vision, and the tags that images have in common are then used to visually cluster related images. The approach has four major steps. First, the images are tagged with the help of a computer vision API. Second, the images are downloaded and saved locally. Third, a network of images and tags is constructed and visualized in Gephi. Finally, the images are loaded into the network and exported.
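The clustering logic described above, in which images are related through the tags they have in common, can be sketched in Python. The tags below are invented rather than obtained from Clarifai, and the sketch links images directly to one another, whereas the lab builds a network of images and tags in Gephi.

```python
from itertools import combinations

# Hypothetical tags per image, standing in for computer vision output.
image_tags = {
    "img1.jpg": {"cat", "sofa", "indoor"},
    "img2.jpg": {"cat", "indoor"},
    "img3.jpg": {"beach", "sea"},
}


def shared_tag_edges(tags_by_image, min_shared=1):
    """Link every pair of images that share at least `min_shared` tags."""
    edges = []
    for img_a, img_b in combinations(sorted(tags_by_image), 2):
        shared = tags_by_image[img_a] & tags_by_image[img_b]
        if len(shared) >= min_shared:
            edges.append((img_a, img_b, len(shared)))
    return edges


image_edges = shared_tag_edges(image_tags)
print(image_edges)
```

Images with many tags in common receive heavier links, which is what pulls them together when the network is laid out.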

4. Requirements 

Please have Gephi installed and sign up for a Clarifai account. The Image Preview Plugin must also be installed in Gephi.

 

1. Practical Lab 

Visualising image clusters with Gephi

#slides/folder URL:

https://drive.google.com/drive/folders/1BrbisV6K3_DBHlI0RAdfykorVMqhOOya?usp=sharing

2. Facilitators 

Antonella Autuori, Matteo Bettini, Andrea Elena Febres Medina

3. Short Description 

This practical lab helps to cluster and visualize images in a hashtag-image network in the context of social media research. This means that images with similar hashtags can be analysed and visualised based on co-hashtag relations. The main steps to be discussed are how to extract a meaningful network from a spreadsheet file using Table2Net and how to visualize, spatialize and analyze it using Gephi’s image preview plugin.
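To show what extracting a network from a spreadsheet amounts to, the Python sketch below turns rows containing an image URL and a hashtag cell into an image-hashtag edge list, written as CSV with Source and Target columns of the kind Gephi can import. Table2Net performs this step through its own web interface; the column names and rows here are hypothetical.

```python
import csv
import io

# Hypothetical spreadsheet rows.
rows = [
    {"image_url": "https://example.com/a.jpg", "hashtags": "#meme #cat"},
    {"image_url": "https://example.com/b.jpg", "hashtags": "#cat"},
]


def bipartite_edges(table, image_col="image_url", tag_col="hashtags"):
    """Create one (image, hashtag) edge per hashtag found in each row."""
    edges = []
    for row in table:
        for tag in row[tag_col].split():
            edges.append((row[image_col], tag.lower()))
    return edges


def to_edge_csv(edges):
    """Serialise edges as CSV text with a Source,Target header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()


hashtag_edges = bipartite_edges(rows)
print(to_edge_csv(hashtag_edges))
```

In the resulting network, two images end up close together when they are connected to the same hashtag nodes.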

4. Requirements 

Please have Gephi and Image Preview Plugin installed and please install DownThemAll before the workshop.