Welcome to Martin Broholt Trans' R&D-area for Curating Data 2020
Open the different parts of this R&D-area by pressing the navigation on the left side. By clicking the name of the folder you'll open the window for it. If you press the filenames you'll download the file.
Introduction
As a means of constructing and organizing my online research and development area, I have been pursuing an aesthetic which takes me back to a time in my childhood where I would borrow my parents’ PC rather than owning one. It represents a time up until a point in my personal life, where technology presented itself as neutral, and the mere thought of being tracked was something from scary books and dystopian television shows. When technology was something which could be turned on and off at our personal leisure, before smartphones were monitoring every move, every heartbeat, every aspect of communication.; back when, through the dial-up modem, the internet presented itself as pay-per-view, rather than pay-by-viewing.
To be clear, I don’t believe the bygone era I am portraying was rid of tracking or surveillance. Rather I’m attempting at characterizing a personal internal change, going from blissfully ignorant to painstakingly conscious. Much like how throughout this semester my accumulated knowledge and academic positioning has changed, as I’d argue that my knowledge and exposure to relevant articles and books has been situated differently throughout the course of the semester, which is demonstrated through the four assignments exhibited in this portfolio. Not solely in a linearly progressive temporal sense, but further in regard to foci and situatedness, drawing on what Donna Haraway (1988) describes as situated knowledges, as a means of stating that knowledge always carries directionality as it’s coming from somewhere, being entangled within a context and pointing at something, as “[t]he only way to find a larger vision is to be somewhere in particular” (Haraway, 1988, p. 590).
I hope to demonstrate that exact point throughout the coming portfolio, where assignment one through three are put in their place with no relation to each other whatsoever as they’ve been teared from their original contexts, depicting my situatedness at the time of handing them in during the semester. The fourth assignment will attempt at making those connections mentioned above clear, and utilizing the prior assignments towards portraying my current situatedness, wrapping up this entire portfolio.
The following assignments are revised versions of what I’ve handed in throughout the semester. In order to be the change I want to see in the world, in regards to transparency, which I feel is otherwise rarely encountered on the platformed web (Helmond, 2015), I’ve appended PDF-versions of the original assignments one through three, as handed in during the semester, which have been postfixed with a “_V1” in the document title, located within their respective folders in the appendix.
Bibliography
- Haraway, D. (1988). Situated Knowledges: The Science Question in Feminism and the Privilege of Partial Perspective. Feminist Studies, 14(3), 575. https://doi.org/10.2307/3178066
- Helmond, A. (2015). The Platformization of the Web: Making Web Data Platform Ready. Social Media + Society, 1(2), 2056305115603080. https://doi.org/10.1177/2056305115603080
Digitizing physical presence
The process of digitizing presence
I have digitized my physical presence on my bike in regards to spatiotemporal dimensions (Wright et al., 2003, p. 48) of data, in order to probe the epistemic implications for such data. In order to accomplish this, I needed to decide on a data-collecting platform fit for the task; I chose Strava because of its approachability, supposed reliability, and – critically – its promises to uphold my rights to export any acquired data relating to me in a machine-readable format, as a EU citizen. I then strapped my phone to my road-bike via a cheaply acquired silicone phone-holder designed for bikes, and routinely and systematically recorded any and all bike rides conducted on my road-bike – meaning regular day-to-day commutes have been excluded from collection. The data collection was in effect from August 27th until September 24th of 2020.
A part of what I wanted to experience and examine was indeed “going in blind” from that point on; not knowing exactly in what format the data would be delivered to me, and, more importantly, with which taxonomic and curational efforts, in terms of the categorization and structuring of metadata, already applied to it, as described by Katharina Weinstock as:
… curatorial exploration of personal cosmologies placed a special emphasis on catalogues, collections, and taxonomies - formats that seem to have proven most qualified to suggest idiosyncratic reconnections of “the order of things” (Weinstock, 2020, p. 231).
The dataset has been appended as the ZIP-file named “export_66439101.zip” in its entirety. I’ve exercised care in the preservation of the original structure and naming-schemes, as a measure of portraying the above-mentioned efforts. I have, though, made sure to anonymize pieces and bits of data which can be considered personally identifiable throughout the dataset. But I haven’t anonymized the start- and end-coordinates from the rides, neither in the “raw” (Gitelman, 2013) activity-data, nor as they have been converted from GPX- into CSV-data via the website “https://www.gpsvisualizer.com”, as I would consider that ‘tampering’ with the data itself. I consider the aggregated dataset consisting of timestamped coordinates as the end result of my inquiry, which has been appended as “20200924080218-22138-data.csv”. Removing singular data entries on a subjective basis would endanger those exact epistemic considerations which I’ve sought to investigate. Some ethical considerations here have been weighed between the ability of the data to expose personally identifiable information (relating to me and myself only, though), and the benefits of research (Markham & Buchanan, 2017).
Data and its representational value
Throughout this process, I’d been reflecting upon the epistemic implications of the data, considering whether the data represents my “being-in-the-world” (Zahavi, 2019), or just arbitrary bits and bytes which happens to afford being read as timestamped coordinates, which turns out to correspond to my bike rides. I found that the epistemic value inscribed with that data is subjective to me, as machines processing the data, or other individuals glancing through it, have no recollection with the physicalness of generating the data – that aspect is mine alone. I take the meaning of this to be, that the data has truthful value to me as I alone can correlate the data to my social knowledge, and consider it as representational value for others.
The data handed over to me, representing my physical being in the world, from Strava consist of mainly quantitatively generated metadata (such as total lengths, durations spent, elevations and derived speeds). But there are a few ‘sprinkles’ of qualitative data, consisting of my perceived exertion, as Strava prompts the user to enter as an activity finishes, and naming of each activity – which is ironically algorithmically determined if not manually filled in, being attributed various quantitative metadata acquired from the activity itself and the chosen privacy settings. The result of this being, that most of my bike-rides were simply named ‘Afternoon Ride’.
Getting the data off of the collection platform (Strava, in this case) was at no time throughout the process considered as difficult, as it is my right, defined in the GDPR (European Parliament and Council of European Union, 2016, article 15). It posed a challenge actually remembering and differentiating my personal memories of the bike rides, and in order to name the individual rides in the final dataset, as a way of distancing myself from the algorithmically derived names. I had no other option than to consult my calendar, and the questionable nature of the nam-ing process is reflected in the names as they appear in the dataset.
Considering my effort as part of mass digitization
Following the same train of thought going forward as Nanna Thylstrup, with “[t]hinking about mass digitization as an ‘assemblage’ [which] allows us to abandon the image of a circumscribed entity in favor of approaching it as an aggregate of many highly varied components and their contingent connections” (Thylstrup, 2018, p. 23), seems to me quite prolific in terms of approaching a such complex phenomenon as mass digitization. The way of perceiving and approaching a process of mass digitization, as Thylstrup describes with a quote from Bruno Latour; “Groups are not silent things, but rather the provisional product of a constant uproar made by the millions of contradictory voices about what is a group and who pertains to what.” (Latour, 2005, p. 35; as quoted in Thylstrup, 2018, pp. 23–24), referring to the manyfold meanings attributed to this phenomenon, by different communities and practices which makes something as complicated as this appear as an coherent assemblage (Thylstrup, 2018, p. 24).
With a departure of the ‘mundane’ data (Pink et al., 2017) being termed as social, the considerations of Langlois, Redden & Elmer (2015) appears plausible, in that
social data is not only a product of computer science; it is also linked to processes of archiving, of remembering and forgetting. From this perspective, social data becomes thick and multidimensional. It does not simply exist to be classified, measured, and correlated, but as a trace of practices that establish cultural and social continuities and discontinuities (Langlois et al., 2015, p. 11).
I, as an individual, has physically been where the data is pointing and experienced that which has been digitized – that’s the only reason the data’s there in the first place. This digitized external storing of my physical presence is being called into existence purely for archival and statistical reasons, as it – on its own – does not enhance what I knew, or contribute anything inherently new, referring to what Richard Rogers (2017) describes as ‘Natively Digital’ (Rogers, 2017).
But looking beyond the phenomenon of ‘mass digitization’, something is happening with the data; the aggregation of data as a whole – a seemingly mundane transformation of data, from a bunch of timestamped coordinates into larger ‘sets’ of data, upon which ever more metadata can be extracted, and mathematical equations can be performed. I recognize this as efforts of ‘datafication’ (Mejias & Couldry, 2019; Van Dijck, 2014).
Bibliography
- European Parliament and Council of European Union. (2016, April 27). Regulation (EU) 2016/679 [Official Journal of the European Union]. Official Journal of the European Union. https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32016R0679&rid=3
- Gitelman, L. (2013). “Raw data” is an oxymoron. The MIT Press.
- Langlois, G., Redden, J., & Elmer, G. (Eds.). (2015). Compromised data: From social media to big data. Bloomsbury Academic, an imprint of Bloomsbury Publishing, Inc.
- Latour, Bruno. (2005). Reassembling the social an introduction to actor-network-theory. OUP.
- Markham, A., & Buchanan, E. (2017). Research Ethics in Context. In M. T. Schäfer & K. van Es (Eds.), The Datafied Society (pp. 201–210). Amsterdam University Press; JSTOR. https://doi.org/10.2307/j.ctt1v2xsqn.19
- Mejias, U. A., & Couldry, N. (2019). Datafication. Internet Policy Review, 8(4). https://doi.org/10.14763/2019.4.1428
- Pink, S., Sumartojo, S., Lupton, D., & Heyes La Bond, C. (2017). Mundane data: The routines, contingencies and accomplishments of digital living. Big Data & Society, 4(1), 12. https://doi.org/10.1177/2053951717700924
- Rogers, R. (2017). Foundations of Digital Methods. In M. T. Schäfer & K. van Es (Eds.), The Datafied Society (pp. 75–94). Amsterdam University Press. https://doi.org/10.1515/9789048531011-008
- Thylstrup, N. B. (2018). The politics of mass digitization. The MIT Press.
- Van Dijck, J. (2014). Datafication, dataism and dataveillance: Big Data between scientific paradigm and ideology. Surveillance & Society, 12(2), 197–208. https://doi.org/10.24908/ss.v12i2.4776
- Weinstock, K. (2020). Rearranging the World: Found Objects and the Collection, Pre- and Post-Internet. In B. von Bismarck & B. Meyer-Krahmer (Eds.), Curatorial Things: Cultures of the Curatorial 4 (pp. 229–253). STERNBER PR.
- Wright, P., McCarthy, J., & Meekison, L. (2003). Making Sense of Experience. In M. A. Blythe, K. Overbeeke, A. F. Monk, & P. C. Wright (Eds.), Funology (Vol. 3, pp. 43–53). Springer Netherlands. https://doi.org/10.1007/1-4020-2967-5_5
- Zahavi, D. (2019). Phenomenology: The basics (Original edition). Routledge, Taylor & Francis Group.
The Datafied Self
Retrieving and defining
In order to represent myself through my social media data, I downloaded my Facebook- and Twitter-data, acquired via my GDPR-granted “right of access” (European Parliament and Council of European Union, 2016, art. 15), and then uploaded it into the “Apply Magic Sauce”-project, which is "a non-profit academic research project coordinated by the University of Cambridge Psychometrics Centre" (University of Cambridge: The Psychometrics Centre, n.d.).
The aim of the tool is to predict, or give an educated guess as to how I might be interpreted by algorithms, from digital footprints left behind from my datafied behavior, so that it helps me understand the data which is said to comprise me online. Or as they describe it themselves: "a modest attempt to reverse the trend in Big Data and empower citizens to not only retain control of their data but also derive meaningful insight from it” (University of Cambridge: The Psychometrics Centre, n.d.). It operates solely from data derived from my prior performative actions (for example “Likes”, “Posts”, “Comments”, etc. from Facebook), and not from data which may or may not have been entered manually by me, as for example age. This data can be said to constitute the backbone of my datafied self (Cheney-Lippold, 2017) – the very same basis on which I’m, as an example, being served targeted ads online. I here approach the datafied self as a digital object, being “objects that take shape on a screen or hide in the back end of a computer program, composed of data and metadata regulated by structures or schemas” (Hui, 2016, p. 1).
The reason behind opting for this specific course of action, is an interest in finding out how machines and algorithms perceive my datafied self, as part of better understanding datafication as a phenomenon. My aim for using this tool has been to become confronted with and explore the data-foundation, from which Facebook and Twitter can be said to be creating a datafied version of me, and to question the implied results and impacts of this.
Experiencing and understanding the data behind the datafied self
Having been confronted with the amount of data Facebook is storing about me, I was left in a staggered state of powerlessness. I strongly believed that much of the data I was presented with had already been deleted by me, but it might have ended up in a state of reminiscent of a virtual limbo instead - hidden from plain sight, but still very much in effect. Simply skimming through the thousands and utter thousands of JSON-formatted lines, every single one being a part of my datafied self, created an urge wanting to delete my Facebook-profile entirely. But alas, I have an inherently hard time trusting a “deletion” of my profile, when all of that data is so deeply engrained into the Facebook-ecosystem, distributed onto the platformed web (Helmond, 2015). Especially after experiencing Anne Helmond lecturing on Facebooks platform as a business model, and their ways of routinely and systematically circumventing disclosure of unofficial partnerships and open “data faucets” (Helmond, 2019).
The data which Facebook has amounted about me is not just scary in regard to pure size but turn exponentially more frightening when that same data is inputted into “Apply Magic Sauce”. It outputs predictions, ranging from parameters like age, psychological gender, top 5 personality traits, to political orientation, religious orientation and relationship status. In this case I also inputted my acquired Twitter data, but it didn’t seem to influence the results too much as it mainly takes original tweets into account - of which I have none. It proved to be both creepily correct and alarmingly wrong, at the same time. And as part of my reflections upon this, I’m suddenly not sure which part is the worst of it. It seems that the more accurate the data is, the more effective algorithms have derived it. At the same time, when the algorithms are wrong it has actual effects on me in my social life, for example when I’m served ads or suggested content online – being an issue I bring up in the conclusive part of this assignment.
The Data Selfie
My ‘Data Selfie’ (see appendix “Data Selfie.png”) has been produced directly from the “Apply Magic Sauce” results, being representations of the graphs and aggregated results. I’ve chosen sub-results which I determine to being either completely nonsensical, scarily correct, or just plain wrong. A part of the process of choosing has also been to avoid presenting data which I for some reason don’t feel comfortable sharing. I also commented the assumptions where applicable. Visualizing the results made me think more deeply of them, almost as if I were looking into a mirror of how I’m being perceived online, as I inscribed my reactions directly on the visualization, countering and opposing the results.
I deem these results to be the outcome of juxtaposition, partly on because they’re derived from my datafied self as a digital object, which from the offset I consider as non-representative as vast parts of the actual data comprising it can be said to originate from earlier editions of me as a social individual. The profiling portrayed in the selfie reflects different phases of social life I’ve gone through, since creating a Facebook-profile in 2010. It also encapsulates wildly different personas, as I at this point at least attempt at sharply separating my “personal”-, “professional”- and “board of directors”-personas, as I, at this stage in my life, mainly consider Facebook as a "necessary-evil" for managing groups of people and carry out communication, being quite contextual at that. That also includes the fact that that I consciously haven’t been ‘liking’ arbitrary or non-arbitrary things on Facebook for at least the last few years. Yet the ‘old’ likes, of which I was certain that I’d already removed, is now partaking in defining me as a person, much in the same way as Cheney-Lippold describes, that “our algorithmic identities also regulate us in many different … ways” (Cheney-Lippold, 2017, p. 100)
Conclusive research question
I’ve been tasked to base my conclusion upon a research question, on the basis of this assignment insofar, which I’ve here chosen to infer from the statement, that “[w]ho we algorithmically are is not an intelligible list of adjectives but a mishmash of patterns that, at the moment, is good enough for Google [, Facebook and Twitter]. They’re useful for them but ultimately meaningless for us” (Cheney-Lippold, 2017, p. 148).
My research question is then: Can these datafied selves be said to hold value? I would pose this question on the back of the unresolved questions in the back of my mind, whilst reflecting on the assignment as is; if I consider opting out of the datafication on social media as a form of resistance tactic (Duffy & Chan, 2019), am I then furthering or hindering myself in life, leaving my datafied self representing a much younger me? Am I instead subjecting myself to self-surveillance, at the cost of expressing myself? In order to explore that research question, I would take on the framework of Data Colonialism, as proposed by Couldry & Mejias (2019a, 2019b, 2019c; 2019) building upon decolonialism and the concepts of capitalism and Marxism, as an instantiation of the datafication of our modern society, as discursively articulated by, among others, Cukier & Mayer-Schönberger (2013) and van Dijck (2014).
Bibliography
- Cheney-Lippold, J. (2017). We are data: Algorithms and the making of our digital selves. New York University Press.
- Couldry, N., & Mejias, U. A. (2019a). The costs of connection: How data is colonizing human life and appropriating it for capitalism. Stanford University Press.
- Couldry, N., & Mejias, U. A. (2019b). Data Colonialism: Rethinking Big Data’s Relation to the Contemporary Subject. Television & New Media, 20(4), 336–349. https://doi.org/10.1177/1527476418796632
- Couldry, N., & Mejias, U. A. (2019c). Making data colonialism liveable: How might data’s social order be regulated? Internet Policy Review, 8(2). https://doi.org/10.14763/2019.2.1411
- Duffy, B. E., & Chan, N. K. (2019). “You never really know who’s looking”: Imagined surveillance across social media platforms. New Media & Society, 21(1), 119–138. https://doi.org/10.1177/1461444818791318
- European Parliament and Council of European Union. (2016, April 27). Regulation (EU) 2016/679 [Official Journal of the European Union]. Official Journal of the European Union. https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32016R0679&rid=3
- Helmond, A. (2015). The Platformization of the Web: Making Web Data Platform Ready. Social Media + Society, 1(2), 2056305115603080. https://doi.org/10.1177/2056305115603080
- Helmond, A. (2019, October 31). Websites, platforms, and apps: Examining the techno-commercial evolution of digital objects and digital ecosystems [Guest Lecture]. Centre for Internet Studies, Aarhus University | Incuba, Large Auditorium, Åbogade 15, 8200 Aarhus N. https://cfi.au.dk/news/article/artikel/lectures-and-workshop-on-historiographies-of-websites-platforms-and-apps/
- Hui, Y. (2016). On the existence of digital objects. University of Minnesota Press.
- Mayer-Schönberger, V., & Cukier, K. (2013). Big data: A revolution that will transform how we live, work and think. Murray.
- Mejias, U. A., & Couldry, N. (2019). Datafication. Internet Policy Review, 8(4). https://doi.org/10.14763/2019.4.1428
- University of Cambridge: The Psychometrics Centre. (n.d.). Apply Magic Sauce—Prediction API. Apply Magic Sauce. Retrieved October 1, 2020, from https://applymagicsauce.com/about-us
- van Dijck, J. (2014). Datafication, dataism and dataveillance: Big Data between scientific paradigm and ideology. Surveillance & Society, 12(2), 197–208. https://doi.org/10.24908/ss.v12i2.4776
Zoteronomics
In order to carry out this assignment 3, we’ve chosen to utilize Zotero as our tool of choice. Considering the Greek “taxis” meaning arrangement, and “nomos” meaning law (‘Taxonomy’, 2020), we replaced the arrangement part with Zotero, as the taxonomic method morphed into ‘zoteronomics’ – a different take on the subject of folksonomy. We here take on a definition of folksonomy, as the collective acts of idiosyncratic grass-roots categorization with lowered bars of entry in regards to taxonomic perfection (Pink, 2005). This improvised ad hoc portmanteau portrays the essence of our efforts exhibited throughout this assignment; we don’t see it as exactly taxonomy (lacking elements of scientific rigidity) nor folksonomy (lacking the broader grass-roots element), defining our work within the specific context of the particularities of Zotero.
Ambitions and decisions
The original task had designated Calibre as the main platform, but we felt that Calibre was standing in our way in different regards. The main argument for leaving Calibre behind was the lack of centralisation, in that if we were not prepared to utilize a third-party serv-er-based version then we would be limited to collaborating in a way where we would all need to synchronize our tagging-efforts in a structured manner in order to avoid doubles and conflicting tagging. This builds on a general ambition to create an approach towards this assignment, of not designating a single member of the group as gatekeeper of the pro-ject as it progresses. By opting for a cloud-based centralised tool, Zotero, we weren’t bound to be physically present in the same room - which would have been considered as highly impractical during a pandemic. This decision also supported a liberal and social approach to the tagging-procedure, as each tag did not need to be critically examined col-lectively before being put in action. This has allowed each member of the group to con-tribute with texts and tags, which were then addressed in plenum when inaccuracies or inconsistencies occurred.
Disembodied metadata
There are extensive differences in the act of maintaining or creating a “library” when working with Zotero rather than Calibre. First and foremost, Zotero is not fixed on having the material or object, itself - be it a book, an article or the like - although it is possible to a certain extent. A Zotero library is rather made up of arbitrary relational data-structures which is built purely by metadata on metadata, meaning that it does not necessarily function as a library in the classical way of being able to sort-to-find objects, but instead as a system for maintaining references. It has been built upon invariable and fixed standards making up the type of an object, which in turn then has the final say in the relevant metadata or taxonomies applying to that specific type. Novels afford and require a different type of metadata than a journal, which is then treated different than a scientific article. We consider this approach to metadata attached to a prototypical and fixated taxonomy as “disembodied metadata”, as the object represented by the metadata can be interdicted from what is stored in Zotero.
Hierarchy
The Collection-function remains the only way to express hierarchy in Zotero, which we have used to differentiate between syllabus and non-syllabus material as child-objects to the parent folder entitled “Curating Data Stud. Grup.” (see appendix file "Zotero Screenshot.png”). Further down in the hierarchy, we discussed whether the most practical solution to further organise the material would be either according to the week in which it was part of the syllabus, or according to the general theme of the given week. The latter allows for greater legibility both for us, in case we cannot remember which week we were introduced to what topic, but also for an outsider who has no knowledge about the texts' relation to the course. Sorting according to week is, however, more practical as our shared experience of attending the course allows us to locate texts quickly from our memory of when it was incorporated in relation to the course as a whole. We ended up opting for sorting by week, with additional tagging of objects ac-cording to the theme of the week.
Dynamic categorization of knowledge
As Katharina Weinstock (2020) describes types of collections as valid art forms, we in our group responded positively towards the description of our ‘curatorial things’ as “[d]etached from their original functions and subjected to processes of de- and recontextualisation, the status of found objects proves malleable” (Weinstock, 2020, p. 234), as the meaning and significance of our objects would constantly change depending on the constellation in which we were attempting to place it. The interrelations were dynamic and under constant negotiation, as the introduction of every new text would shake up the previously established (at rest) relations, thus requiring even more granular levels of categorization.
The introduction of a third sub-genre of literature, in our case ‘Alternative History’, inevitably dictated an urgency towards the introduction of new tags. But also a re-evaluation of prior tags of previously negotiated texts, as this introduction alone then forced a judge-ment of their relation to the theme which had been introduced into the collection; whether or not the content of said texts could feasibly be considered ‘alternative’ or ‘scientifically factual’ in regards to its historicity. With this introduction alone, the entirety of the collec-tion then needed reframing in the light of facts and history being shaped by politics. When labelling alien-related texts as ‘alternative history’, we indirectly deny its value in main-stream knowledge production. The reality of assessing whether sources are credible is, of course, much more complex and subjective, but the affordances of tagging require a binary differentiation; a tag is either or it is not. Thereby, our efforts of zoteronomics can be said to demonstrate a paradigm of what qualifies as scientific research, illustrating how knowledge is constructed and not objective, but instead subject to politics and socially negotiated.
The outcome
The final end-product of our efforts can be examined manually and thoroughly through the exported CSV file (see appendix file “Curating Data Stud.Grup..csv”), where all tags and all possible metadata are machine- and human-readable. This also stands to show the inevitable inaccuracies of our work as is, and has been included in an effort to portray these efforts of zoteronomics as transparent as possible. By creating what we would in this case term a pseudo-folksonomy through the instantiation of our own portmanteau, zoteronomics, we’ve here experienced first-hand how it inductively expresses the politics and shared values of the tagging-process.
Bibliography
- Pink, D. H. (2005, December 11). Folksonomy (Published 2005). The New York Times. https://www.nytimes.com/2005/12/11/magazine/folksonomy.html
- Taxonomy. (2020). In Encyclopedia Britannica. https://www.britannica.com/science/taxonomy
- Weinstock, K. (2020). Rearranging the World: Found Objects and the Collection, Pre- and Post-Internet. In B. von Bismarck & B. Meyer-Krahmer (Eds.), Curatorial Things: Cultures of the Curatorial 4 (pp. 229–253). STERNBER PR
Considering datafied colonialism
In this fourth and final assignment I approach the perception of datafication through a lens of colonialism and exemplify this tendency and its effects through the three individual assignments prior to this.
Datafication and Data Colonialism
Datafication, as articulated by among others Mayer-Schönberger & Cukier (2013), van Dijck (2014) and Mejias & Couldry (2019) can in itself be understood as a colonial pro-cess. Not just metaphorically speaking, when reading that “data is the new oil” (The Economist, 2017), but literally as a new mechanism of data colonialism (Couldry & Mejias, 2019a), which is appropriating human life in order for data to be repeatedly with-drawn from it in order to reach benefits for capitalist interests (Mejias & Couldry, 2019). As Mejias & Couldry and Lauren Scholz have noted, this metaphor can be said to actually “side-step evaluation of any misappropriation or exploitation that might arise from data use” (Mejias & Couldry, 2019, p. 4; Scholz, 2018, p. 2).
When situated within the framework of Data Colonialism, as presented by Couldry and Mejias (2019a, 2019b, 2019c), datafication can be investigated as an extension of the co-loniality of power, defined as the power which lies in its control over “social structures in the four dimensions of authority, economy, gender and sexuality, and knowledge and subjectivity“ (Mohamed et al., 2020, p. 5; Quijano, 2000). Information and communica-tion technologies have, historically, enabled the surveillance and administration of colo-nies in conjunction with propagation of legitimizing narratives in regards to dispossession and extraction, and datafication can be said to continue and extend those functions (Mejias & Couldry, 2019, pp. 6–7); the concept of datafication can in this colonialized lens be said to surpass the scope of ‘just’ a capitalist mechanism - it’s a land grab (Couldry & Mejias, 2019a).
Partaking in datafication
In the first assignment I argued that I’d “digitized my physical presence on my bike in regard to spatiotemporal dimensions”. The data, referring to the CSV-file (See “Appen-dix/Assignment_1/20200924080218-22138-data.csv”), is pointing at exactly where I’ve been at what exact times represented through GPS-coordinates and timestamps.
If I, instead of approaching this as efforts of digitization as I’ve done in the assignment itself, am to address this phenomenon as acts of exactly datafication as described above, I’d argue that I am willfully positioning myself as both subject and object. By supplying ‘raw’ (Gitelman, 2013) nonsensical data, and in turn receiving meaningful meta-data which has been extracted from it, I’ve deliberately entered into what Couldry & Mejias describes as “data relations”, being “ways of interacting with each other and with the world facilitated by digital tools” (Couldry & Mejias, 2019a, p. xiii).
I can be said to voluntarily enact, as Karen Barad would phrase it, agency (Dolphijn & Tuin, 2012) towards shaping my life in a such fashion, so that I am better able to get hold of improved data. This, in an effort to supply Strava with it, in order to myself be able to better understand and communicate my workout sessions. Through this lens, Strava has effectively morphed into a medium of communicating and interacting with others, in a social fashion, and the world around me in my quotidian life as an amateur cyclist – of which I can only be assumed to have partial, or at least limited control over. But more importantly they now hold the power of knowledge, as they are the ones appropriating the data, in the meaning-making from coordinates and timestamps.
If I were to not adhere to Strava’s predefined models of workouts, the workout I’d just performed might as well not exist, and I would be hard pressed to communicate my ef-forts, as Strava is doing this ”codifying of life into datafied realities” (Mejias & Couldry, 2019, p. 4). In this case, I am now working hard, quite physically and literally, in order to supply Strava with sufficient datapoints of my whereabouts at certain times; one could say that I’ve become subject to continued surveillance and monitoring, as my physical efforts are fueling Strava’s capitalistic machine from which they extract profit – if one were to stay within the eluding metaphor of data being modern oil.
The value of the datafied self
As for assignment two, I consider it natural to continue off of the conclusive research question posed; can these datafied selves be said to hold value?
The datafied self, as a digital object, holds value as a currency of information, extracted and traded by corporations - parts of what Couldry & Mejias defines as the social quanti-fication sector; “the industry sector devoted to the development of the infrastructure re-quired for the extraction of profit from human life through data” (Couldry & Mejias, 2019a, p. xiii). I have here approached the value of datafied selves in the consideration of economic value, as the information which the data constitutes is as they, for example, form the basis for ad hoc advertisement on multi-sided market platforms, as noted by Tar-leton Gillespie (2010, 2018).
As economic value can be said to be extracted from these datafied selves, being the reason they are systematically and routinely produced and maintained in the first place, it seems natural to assume some degree of ontological value imbued in those. Although Cheney-Lippold says they’re ultimately meaningless for us, they are still, at the moment of adver-tisement, “good enough for Google” (2017, p. 148), which I interpret as a way of stating, that even though our datafied selves are gibberish and meaningless for us directly, they’re indirectly responsible for how were treated by algorithm-driven platforms as Google, Fa-cebook and Twitter. As long as their perceptions of us work, they don’t need to necessari-ly be correct – they are still very capable of exercising, what Couldry & Mejias take on as, modern colonialism; we need not look further than the Facebook and Cambridge Analyti-ca scandal (Couldry & Mejias, 2019a, p. 3).
The datafied self which I had explored in assignment two was the product of a limited third-party algorithm, the Apply Magic Sauce project, on behalf of a limited set of data. In the real tangible world towards a future, which Couldry & Mejias defines as the Cloud Empire, “a totalizing vision and organization of business in which the dispossession of data colonialism has been naturalized and extended across all social domains” (Couldry & Mejias, 2019a, p. xiii), we experience an amassing information as the transformation of social relations emerges. It can be considered as “the larger outcome of this combined growth of the social quantification sector and data practices right across business and social life” (Couldry & Mejias, 2019a, p. xv), which I will definitely describe as holding value; operating within the system of data colonialism, the datafied self produced through acts of datafication can thus be said to hold value.
Disembodied selves
As for my third assignment I will be pointing at my own partaking of the subjective role of a curatorial process, as opposed to being an object of curation myself which has been the focus up until this point. I originally worked with a group towards creating what in this frame of reference can be approached as a folksonomy, which we termed zoteronomics, as a way of dealing with taxonomies in a social setting, where we were forced to col-laborate in order to reach a meaningful categorization in a Zotero library consisting of what we termed disembodied metadata.
We’d mainly been judging and categorizing, regarding the texts which we hadn’t read, judging from their covers, abstracts, table of contents, perhaps skimming the contents. But nonetheless we’d been categorizing them alongside known texts. As Zotero operates mainly on meta-data, we haven’t been working with ‘data’ per se, but rather operating from and upon metadata comprising the objects of categorization. This experience of working with what we’ve termed disembodied data has mediated my personal stance and view towards the workings and intentions of algorithms and their connections towards decision-making upon my actual lived social life. As these texts has had nothing to say for themselves in regard to our collective evaluation of its constituents, so do I not have much to say in how my online reality is tailored as to how I’m constantly under evaluation – at the end of the day, we’re all just being judged by our covers, being juxtaposed. In this way, we surely are not considered greater than the sum of our parts – we are nothing but the parts. Parts which are designed to fit the information models subjectable to enumera-tion.
I here want to continue this notion of disembodiment, as I argue the same phenomenon is taking place with our datafied selves. Having taken part of this process, I here attempt to draw comparisons with the colonialism-part of Couldry & Mejias’ notion of data colonialism, as by disembodying historical slaves’ potential to generate value, be it extracting coal from the ground or by producing high quality data for Strava, the object is separated from the subject; the same, I argue, applies when our datafied selves becomes disembodied from our social selves, as we are reduced to labor power in a modern Marxist understanding (Couldry & Mejias, 2019a, p. 30), as we enter data relations which are “normalizing the exploitation of human beings through data, just as historic colonialism appropriated territory and resources and ruled subjects for profit” (Couldry & Mejias, 2019b, p. 336).
Final comments
Having approached the prior assignments with this framework of data colonialism, I’m left wondering how these data relations can be said to affect me. I’ve got no real grasp of the actual inner workings on some of these relations I take active part of; data for services. I quote Couldry & Mejias as a provocation directed at myself: “The final turn of the data relations spiral comes when categorizations derived from the processing of your data are applied back onto you, the human subject, from whom the data was derived, whether or not you are identified by name” (Couldry & Mejias, 2019a, pp. 29–30). I do not see myself as the modern equivalent of a Marxist slave (2019a, p. 30), but I now more than ever realize my partaking in the datafication as a whole, and as “[d]ata subjects often attempt to modulate their behavior in order to influence the algorithm that they believe is categorizing them” (2019a, p. 30), I also now recognize my enactment of myself online as exactly that; a data subject believing that algorithms is categorizing it. Am I then a realist or pessimist? I suppose I will find out eventually – in the meantime I will continue opting out of social media, embracing my current situatedness whilst remaining open towards new ones.
Bibliography
- Cheney-Lippold, J. (2017). We are data: Algorithms and the making of our digital selves. New York University Press.
- Couldry, N., & Mejias, U. A. (2019a). The costs of connection: How data is colonizing human life and appropriating it for capitalism. Stanford University Press.
- Couldry, N., & Mejias, U. A. (2019b). Data Colonialism: Rethinking Big Data’s Relation to the Contemporary Subject. Television & New Media, 20(4), 336–349. https://doi.org/10.1177/1527476418796632
- Couldry, N., & Mejias, U. A. (2019c). Making data colonialism liveable: How might data’s social order be regulated? Internet Policy Review, 8(2). https://doi.org/10.14763/2019.2.1411
- Dolphijn, R., & Tuin, I. van der. (2012). New Materialism: Interviews & Cartographies. New Metaphysics. https://doi.org/10.3998/ohp.11515701.0001.001
- Gillespie, T. (2010). The politics of ‘platforms.’ New Media & Society, 12(3), 347–364. https://doi.org/10.1177/1461444809342738
- Gillespie, T. (2018). Regulation of and by Platforms. In J. Burgess, A. Marwick, & T. Poell, The SAGE Handbook of Social Media (pp. 254–278). SAGE Publications Ltd. https://doi.org/10.4135/9781473984066.n15
- Gitelman, L. (2013). “Raw data” is an oxymoron. The MIT Press.
- Mayer-Schönberger, V., & Cukier, K. (2013). Big data: A revolution that will transform how we live, work and think. Murray.
- Mejias, U. A., & Couldry, N. (2019). Datafication. Internet Policy Review, 8(4). https://doi.org/10.14763/2019.4.1428
- Mohamed, S., Png, M.-T., & Isaac, W. (2020). Decolonial AI: Decolonial Theory as Sociotechnical Foresight in Artificial Intelligence. Philosophy & Technology. https://doi.org/10.1007/s13347-020-00405-8
- Quijano, A. (2000). Coloniality of Power and Eurocentrism in Latin America. International Sociology, 15(2), 215–232. https://doi.org/10.1177/0268580900015002005
- Scholz, L. (2018). Big Data is Not Big Oil: The Role of Analogy in the Law of New Technologies. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3252543
- The Economist. (2017, May 6). The world’s most valuable resource is no longer oil, but data. The Economist. https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data
- van Dijck, J. (2014). Datafication, dataism and dataveillance: Big Data between scientific paradigm and ideology. Surveillance & Society, 12(2), 197–208. https://doi.org/10.24908/ss.v12i2.4776