
Search Results

University of Ottawa Dataverse Logo
Borealis
Van der Kolk, Jarno; Darveau, Peter; Tayler, Felicity 2024-10-10 This tutorial shows how to detect patterns with the Naive Bayes classifier, a machine learning technique that is effective at detecting certain patterns and predicting dependencies within a dataset. The first part revisits the Iris flower dataset used in the previous tutorial to introduce the basic steps of working with the Naive Bayes classifier. The tutorial then applies the classifier to detect spam in SMS text messages, so that unwanted messages can be identified, using the open-source SMS Spam collection dataset from the UCI Machine Learning Repository. Next, it performs multi-label classification using the CMU book dataset previously used for the nearest-neighbours classifier. Finally, it presents a scenario where the Naive Bayes classifier fails and explains why it did not work. By the end of this tutorial, participants will have a solid understanding of the Naive Bayes classifier, be able to split data into training and testing sets, make predictions, evaluate classifier performance, identify spam, classify books, train a Gaussian Naive Bayes classifier for single or multiple labels, and utilize imputation techniques for handling missing data.
Macarios, Jasmin; Tayler, Felicity 2024-10-08 The genre of oral history tapes is a powerful form of mediated oral transmission of knowledge between geographically dispersed communities and generations. The act of listening to recordings of stories of survival and joy forms affective bonds akin to kinship networks for listeners who identify with marginalized sexualities or genders (Chenier 2014). This talk will explore the case of the Lesbian Organization of Toronto (LOOT) oral history tapes as an example of queer intergenerational memory transmission. The LOOT Oral History Project interview tapes were recorded during 1988-1990 by sociologist Becki Ross and are extensively quoted in The House that Jill Built: A Lesbian Nation in Formation. Each of the interviews provides a unique perspective on LOOT’s four-year existence (1976-1980) and the politics of a particular Lesbian community located in Toronto (Ross 1995) that overlaps with poetic and publishing communities in the Spoken Web network. The stakes of intellectual property and privacy for materials such as these tapes are perceived as high risk in digital environments, particularly when working with analogue materials that pre-date digitization and the Internet. In approaching the digitization of the LOOT tapes, we have kept in mind the historical and structural harm that privacy law and intellectual property law continue to perpetuate in the environment of digital archives. This talk will explore our approach through a framework of archival temporalities (Caswell 2021), as we work to navigate intergenerational contexts and reconcile them with our own contexts and identities as queer researchers.
Van der Kolk, Jarno; Darveau, Peter; Tayler, Felicity 2024-10-10 This tutorial is designed to optimize data preparation for machine learning, with a specific focus on predicting bike traffic patterns based on weather conditions. It includes a summary of the learning goals, a section that outlines the requirements for completing the tutorial, and a section on the recommended practices for Research Data Management (RDM). The tutorial employs Linear Regression, a straightforward machine learning model, to make predictions based on the input data. The data is sourced from Ottawa’s bike count data and historical weather data.
Van der Kolk, Jarno; Darveau, Peter; Tayler, Felicity 2024-10-10 This tutorial shows how a decision tree algorithm makes predictions. It comprises three notebooks, each covering a different application of decision trees and of the related random forest algorithm. A random forest is in fact a group of decision trees. In Notebook 1 (Decision Trees), we analyze a dataset on irises, a genus of flower comprising about 300 species of plants in the Iridaceae family that are very popular in gardens regardless of climate zone. Identifying these species by hand would take a great deal of time, so we want to build a model that helps predict the class an iris belongs to from its characteristics. In Notebook 2, we extend the decision tree model to the random forest model, which combines several decision trees that each handle different blocks of data. The model pools the results of the individual trees to make more accurate predictions: the output of the random forest is the class selected by the largest number of decision trees. Notebook 3 shows how random forest classifiers can improve predictions on noisy datasets where simple machine learning models, such as linear regression, fall short. As the noisy dataset, we use the cleaned dataset from the very first tutorial, on preparing data for machine learning. This machine learning training explores the dual nature of Decision Trees, demonstrating a fascinating interplay between human intuition and mathematical optimization. It delves into how decision trees use simple, hierarchical branching based on key features, mirroring how our minds categorize objects using decisive traits.
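As a rough illustration of the voting step described for the random forest (the class selected by the largest number of decision trees wins), here is a minimal Python sketch. The function name `majority_vote` and the sample labels are hypothetical and are not taken from the tutorial notebooks, which may well rely on a library implementation instead.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees.

    `tree_predictions` is a list with one predicted label per tree
    (hypothetical input format, chosen only for this illustration).
    """
    if not tree_predictions:
        raise ValueError("need at least one tree prediction")
    # most_common(1) yields [(label, count)] for the most frequent label;
    # when counts are tied, the label seen first is returned.
    label, _count = Counter(tree_predictions).most_common(1)[0]
    return label


# Five hypothetical trees vote on the species of a single iris.
votes = ["setosa", "versicolor", "setosa", "virginica", "setosa"]
print(majority_vote(votes))
```

In this sketch each tree contributes one vote of equal weight; weighting votes by predicted class probabilities would be a different design.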
Van der Kolk, Jarno; Darveau, Peter; Tayler, Felicity 2024-10-10 This machine learning training explores the power and versatility of Support Vector Machines (SVMs), a class of models that employ mathematical optimization to find maximum margin hyperplanes for classification. The training comprises four notebooks: Regularization, Kernels, Deeper Understanding, and Noise Reduction.
Van der Kolk, Jarno; Darveau, Peter; Tayler, Felicity 2024-10-25 This tutorial series addresses three key gaps in understanding AI and machine learning (ML) methodologies: (1) providing an introduction to AI and ML models, (2) preparing the data these models require, and (3) incorporating research data management (RDM) practices into AI and ML-enabled methodologies.
Tayler, Felicity; Mitchell, Marjorie 2021-06-01 Recording of the "Making research data public" panel, in which DH scholars and digital asset management specialists presented case studies. The panelists were: Constance Crompton (University of Ottawa), Karis Shearer (University of British Columbia Okanagan Campus), Matthew Lincoln (Carnegie Mellon University), and Mikhel Proulx (Concordia University and Indigenous Digital Art Archive).
Macarios, Jasmin; Tayler, Felicity 2023-10-25 This paper was presented at the SpokenWeb Symposium 2023: Reverb: Echo-Locations of Sound and Space. Is metadata a “literary audio event?” The Lesbian Organization of Toronto (LOOT) Oral History Tapes were discussed as a contribution to SpokenWeb, because they enhance 2SLGBTQIA+ content in the metadata from literary events. The oral history tapes of this collection are restricted; therefore, the main goal of this work is not necessarily to make the files public, but to develop a methodological approach to working with descriptive metadata of sensitive files. We hope the project will serve as a case study of ethical data practices that can then be shared with 2SLGBTQIA+ community members, wider researcher communities, archivists, and librarians about how to work with the nuances of digitization and access to sensitive material in historical context. The LOOT Oral History Project interview tapes were recorded during 1988-1990 by sociologist Becki Ross and are extensively quoted in The House that Jill Built. Each of the interviews provides a unique perspective on LOOT’s four-year existence (1976-1980) and the politics of a particular Lesbian community located in Toronto (Ross 1995) that overlaps with poetic and publishing communities in the Spoken Web network. What are the ethics of making these overlaps visible through metadata work, even if the content of the tapes must remain restricted? This paper details the technical approach used to digitize and describe these analogue audio tapes according to archival standards and to the Spoken Web metadata schema. A Data Management Plan was key to documenting our procedures for respecting ethical guidelines (Morissette et al. 2021).
Tayler, Felicity; Macarios, Jasmin 2023-10-25 This paper was presented at the SpokenWeb Archives Research Workshop (Sept. 2023). The LOOT Oral History Project interview tapes were recorded during 1988-1990 by sociologist Becki Ross and are extensively quoted in The House that Jill Built. Each of the interviews provides a unique perspective on LOOT’s four-year existence (1976-1980) and the politics of a particular Lesbian community located in Toronto (Ross 1995) that overlaps with poetic and publishing communities in the Spoken Web network. What are the ethics of making these overlaps visible through metadata work, even if the content of the tapes must remain restricted? The genre of oral history tapes is a powerful form of mediated oral transmission of knowledge between geographically dispersed communities and generations. The act of listening to recordings of stories of survival and joy forms affective bonds akin to kinship networks for listeners who identify with marginalized sexualities or genders (Chenier 2014). But what does it mean when the ethical choice is to put a hold on listening to the tapes until we sort out permissions and donor agreements to institutional archives? El Chenier describes this limbo as a “return to the closet” that queer communities have faced within digital archiving spaces, because the stakes of intellectual property and privacy are perceived as high risk in digital environments, particularly when working with analogue materials that pre-date digitization and the Internet.
Tayler, Felicity 2019-11-27 I need poems. Send poems please. All kinds of poems. #pleafrominsidedataverse
Bah, Fatoumata; Tayler, Felicity 2022-10-05 Slide deck and accompanying worksheet for a session presented at the Compute Ontario Summer School 2022. This session aimed to help attendees understand their data workflow, the importance of documenting it, and the FAIR principles for curating data with a view toward sharing it with others. The worksheet pertains to a case study of a bilingual historian who transcribes 19th-century general store notebooks into Excel sheets, and how she has published these hybrids of tabular and textual data in Dataverse.
Bah, Fatoumata; Tayler, Felicity 2022-12-14 Slide deck for a workshop on research data management best practices, using a Canadian BIPOC artists rolodex as an example.
Tayler, Felicity; Mitchell, Marjorie 2021-05-19 The data-flow diagram template is a way of representing the flow of data through your DH methodologies; it gathers information about the material and immaterial inputs and outputs of each entity and of the process itself. Specific operations based on the data can be represented by a flowchart. This template was used as part of the "Making research data public: workshopping data curation for digital humanities projects" workshop.
Tayler, Felicity; Michell, Marjorie; Ripp, Chantal; Dangoisse, Pascale 2022-05-18 This researcher-centered Primer was collaboratively authored by over 30 Digital Humanities researchers, research assistants and data professionals. It serves as an overview of the different aspects of data curation and management best practices for the Digital Humanities. The files deposited here, in both English and French versions, are drafts.
Tayler, Felicity; Mitchell, Marjorie 2021-05-26 This workshop covered all areas of data management including IP permissions and informed consent, data collection, metadata standards, file sharing, preservation (data deposit), and data sharing through the open data spectrum of access. Participants worked on their own data curation challenges in break-out sessions and with reference to case study examples presented by a panel of DH scholars and digital asset management specialists. The workshop draws significantly upon cases and RDM processes developed, and still in development, across the SpokenWeb research network.
Tayler, Felicity; Mitchell, Marjorie 2021-06-03 Dubbed recording of the "Making research data public" panel, in which DH scholars and digital asset management specialists presented case studies. The panelists were: Constance Crompton (University of Ottawa), Karis Shearer (University of British Columbia Okanagan Campus), Matthew Lincoln (Carnegie Mellon University), and Mikhel Proulx (Concordia University and Indigenous Digital Art Archive).
