67 skills found · Page 2 of 3
marcgarnica13 / Ml Interpretability European Football
Understanding gender differences in professional European football through Machine Learning interpretability and match actions data.

This repository contains the full data pipeline implemented for the study *Understanding gender differences in professional European football through Machine Learning interpretability and match actions data*.

We evaluated the main differentiating features of European male and female football players in in-match actions data, under the assumption that significant differences and established patterns between genders would be found. A methodology for unbiased feature extraction and objective analysis is presented, based on data integration and machine learning explainability algorithms. Female (1511) and male (2700) data points were collected from event data categorized by game period and player position. Each data point included the main tactical variables supported by research and industry to evaluate and classify football styles and performance. We set up a supervised classification pipeline to predict the gender of each player from their actions in the game. The comparison methodology did not include any qualitative enrichment or subjective analysis, to prevent biased data enhancement or gender-related processing. The pipeline had three representative binary classification models: a logic-based Decision Tree, a probabilistic Logistic Regression, and a multilayer perceptron Neural Network. Each model tried to draw the differences between male and female data points, and we extracted the results using machine learning explainability methods to understand the underlying mechanics of the models implemented. Prediction accuracy was consistently good across the different models deployed.

## Installation

Install the required Python packages:

```
pip install -r requirements.txt
```

To handle heterogeneity and performance efficiently, we use PySpark from [Apache Spark](https://spark.apache.org/). PySpark provides an end-user API for Spark jobs. You might want to check how to set up a local or remote Spark cluster in [their documentation](https://spark.apache.org/docs/latest/api/python/index.html).

## Repository structure

This repository is organized as follows:

- Preprocessed data from the two different data streams is collected in [the data folder](data/). For the Opta files, it contains the event-based metrics computed from each match of the 2017 Women's Championship and a single file calculating the event-based metrics from the 2016 Men's Championship published [here](https://figshare.com/collections/Soccer_match_event_dataset/4415000/5). Even though we cannot publish the original data source, the two Python scripts implemented to homogenize and integrate both data streams into event-based metrics are included in [the data gathering folder](data_gathering/).
- Another folder contains the graphical images and media used for the report.
- The [data cleaning folder](data_cleaning/) contains descriptor scripts for both data streams and [the final integration](data_cleaning/merger.py).
- [Classification](classification/) contains all the Jupyter notebooks for each model present in the experiment as well as some persisted models for testing.
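
To make the idea of a binary classifier whose learned parameters can be inspected more concrete, here is a minimal plain-Python sketch of logistic regression trained by gradient descent. It is not the study's code (that lives in the repository's notebooks and is not reproduced in this listing). The feature names `feature_a` and `feature_b`, the toy data, and the use of coefficient magnitude as an importance measure are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_regression(rows, labels, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n, n_features = len(rows), len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += error
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def predict(weights, bias, x):
    """Predict class 1 when the linear score is positive."""
    return int(bias + sum(w * xi for w, xi in zip(weights, x)) > 0)

# Synthetic, centered toy data (hypothetical): feature_a separates the two
# classes, feature_b carries almost no signal.
feature_names = ["feature_a", "feature_b"]
rows = [[-0.4, 0.0], [-0.3, -0.1], [-0.2, 0.1],
        [0.2, 0.0], [0.3, -0.1], [0.4, 0.1]]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = fit_logistic_regression(rows, labels)
accuracy = sum(predict(weights, bias, x) == y for x, y in zip(rows, labels)) / len(rows)

# Rank features by absolute coefficient, a simple form of model inspection.
ranking = sorted(zip(feature_names, weights), key=lambda fw: abs(fw[1]), reverse=True)
print(accuracy, ranking)
```

Coefficient magnitudes are only comparable when features share a scale, which is why the toy features are centered and of similar range.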
arisu-archive / Bluearchive Data
Automatically updated and decrypted client-side data from the Blue Archive game, providing character stats, equipment data, and more for fan tools, research, and analysis.
GylanSalih / GamePriceTracker
The GamePriceTracker is a Python tool for finding gaming products 🎮 on eBay and tracking their prices over time. It uses search queries and filters 🚫 to gather data, which can be saved to CSV files or an SQLite database for easy analysis and long-term tracking. 📈
bobeezy / Video Store ATM Point Of Sale System
“My name is Gregory Guy. I have just purchased a video store, and I need an up-to-date, GUI-driven system to keep track of all the stock in my store. I am not happy with the existing system where everything is done by hand.

“Currently, the store operates on a cash basis, although a contract system might be in the pipeline. You will be contacted to do this at a later stage, if necessary. I have a shop next door that sells sweets, drinks, chocolates etc., which runs from a separate cash register. This should not be included in the system you develop.

“My store not only stocks videos, but also video machines, as well as DVDs. At a later stage, I would like to also stock Sony PlayStation games, controls, and possibly other stock items. I want to be able to add these into the stock list with the minimum of hassle, and without calling in the help of a programmer / system designer.

“I want to store all transactional information in a database, so that my accounting system can interface with the data.

“I charge as follows:

• New Release (Video or DVD): R16
• Older Stock (Video or DVD): R12
• Video Machine: R30
• Video Machine & any two videos: R50

“When I start stocking PlayStation games and/or consoles (or any other stock items), I would probably want to have a two-tier pricing system for them as well (where I can charge more for newer stock).

“It would also be nice to be able to change my prices if and when I need to. I therefore would like the ability to change the price of a ‘New Release’, and that should affect all the videos/DVDs that fall into that category. The same should apply to the other prices mentioned above.

“I have a couple of shop assistants who help me out, and I would like some security built in so that the assistants cannot get access to my financial and other important data.
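
The pricing rules in the brief can be sketched as a price table keyed by category, so that changing one entry re-prices every item in that category. This is only an illustration under stated assumptions: the names `StockItem`, `prices`, and `rental_total` are hypothetical, DVDs are treated as videos for the bundle, the bundle is applied whenever a machine and two videos are in the basket, and the two most expensive videos are the ones placed in it.

```python
from dataclasses import dataclass

# Prices in rand, as quoted in the brief. Items store only a category,
# so changing one entry here re-prices every item in that category.
prices = {"new_release": 16, "older_stock": 12, "video_machine": 30}
BUNDLE_PRICE = 50  # video machine plus any two videos

@dataclass
class StockItem:
    title: str
    category: str  # must be a key of `prices`

def rental_total(items, prices, bundle_price=BUNDLE_PRICE):
    """Total charge for a basket, using one bundle per machine where possible."""
    machines = [i for i in items if i.category == "video_machine"]
    # Assumption: the two most expensive videos go into each bundle.
    videos = sorted(
        (i for i in items if i.category != "video_machine"),
        key=lambda i: prices[i.category],
        reverse=True,
    )
    total = 0
    while machines and len(videos) >= 2:
        machines.pop()
        videos = videos[2:]
        total += bundle_price
    # Anything left over is charged at its category price.
    return total + sum(prices[i.category] for i in machines + videos)

basket = [
    StockItem("Machine 1", "video_machine"),
    StockItem("Film A", "new_release"),
    StockItem("Film B", "older_stock"),
    StockItem("Film C", "older_stock"),
]
total_with_bundle = rental_total(basket, prices)  # bundle plus one older title

# The owner raises the 'New Release' price; no item record has to change.
prices["new_release"] = 18
single_after_change = rental_total([StockItem("Film A", "new_release")], prices)
print(total_with_bundle, single_after_change)
```

Keeping the price on the category rather than on each item is what lets a non-programmer add a new tier later by adding one table entry.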
Functionality:

“I obviously need the system to take care of the most important part of the business: the quick and accurate ‘booking out’ of all stock items. The customer, upon bringing me his/her selection, must be charged accordingly, and the items must be marked as ‘out’.

“The system should also allow me to quickly and easily record the returned stock items, as and when they do come in.

“Sometimes I also want to credit the customer for something, as the tape/DVD/game might have been damaged before they rented it. The item should then be marked as returned, but as money is then given back to the customer, some sort of record should be kept about this credit transaction so that I can trace which assistant allowed the credit. This will help me minimize fraudulent behaviour where assistants can basically book out resources ‘for free’.

“I also want the system to have an advance booking facility, where an existing customer can call in and book a certain video/DVD/other item for a certain day. The system should not allow an item to be booked out twice for a certain date, and if something has been booked out and another customer tries to rent it, at least a warning should be displayed, informing the teller that this is the case. In special cases, such a booking can then be ignored, but most times the teller will inform the customer that s/he cannot have that item for the day. A facility should also be included where the booking can be cancelled at any time, if necessary. (For example, if a customer cancels the booking telephonically, whether it is on the day, or some time in advance).

“Although it could be considered part of the accounting package, I would like this system to be able to do a daily summary, where I am presented with total sales (monetary value), total number of rentals (total videos, total DVDs, total machines), etc. This can be shown to me either on the screen, or in a printed form.
I would like you to decide on the format and content of this screen/report.

“Another function that I would like you to incorporate, is that the system should be able to do some analysis for me. Examples of this include:

• Top Ten rentals
• Top Ten customers
• Stock items that have not been rented out in 6 months or more.

I would like the above three to be done, but if you can think of other examples, feel free to add them in if you have time.

“The system should allow me to add/edit all customer details, and if necessary (not often) delete a customer. Customer details to be stored include, but are not limited to:

• Name
• Surname
• Title
• I.D. Number
• Address and Postal Code
• Telephone Number (Work)
• Telephone Number (Home)
• Telephone Number (Mobile)

“The system should also allow me to update the information regarding my stock items, for example:

• Mark a tape as damaged.
• Change a video from a ‘New Release’ to ‘Older Stock’.
• Change the category it belongs to.

“I have several working, but old machines lying around at my house, and they are already network-capable. I would like you to build some functionality where these machines can be linked to the system you are designing so that they can be used as ‘look-up’ machines. Basically, if a shop assistant is not available, but a customer knows the title of the movie they are looking for, they should be able to go to one of these terminals that I will set up throughout my shop, and enter or select the movie name, and perhaps what they are looking for (video/DVD/game etc.). If my shop carries the chosen item, then the system should give them enough information (shelf number/category etc.) to be able to locate the item in the shop. It should also show if an item is unavailable, and when it is due back. If they select an invalid item, they should be informed of this.

“The above program should run independently of the main system, and should not access the database directly.
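
Two of the requested analyses, the Top Ten rentals and the stock not rented in 6 months or more, can be sketched from a rental log. This is a hedged illustration, not part of the brief: the log layout `(title, customer, rental_date)`, the function names, the sample titles, and the reading of "6 months" as 183 days are all assumptions.

```python
from collections import Counter
from datetime import date, timedelta

def top_rentals(rental_log, n=10):
    """Return the n most-rented titles as (title, count) pairs."""
    return Counter(title for title, _, _ in rental_log).most_common(n)

def stale_stock(all_titles, rental_log, today, days=183):
    """Titles whose latest rental is at least `days` old, or that were never rented."""
    last_rented = {}
    for title, _, rented_on in rental_log:
        if title not in last_rented or rented_on > last_rented[title]:
            last_rented[title] = rented_on
    cutoff = today - timedelta(days=days)
    return sorted(t for t in all_titles if last_rented.get(t, date.min) <= cutoff)

# Hypothetical sample data: (title, customer id, rental date).
rental_log = [
    ("Alien", "c1", date(2024, 1, 5)),
    ("Alien", "c2", date(2024, 5, 1)),
    ("Heat", "c1", date(2023, 9, 1)),
]
all_titles = ["Alien", "Heat", "Jaws"]

best = top_rentals(rental_log, n=1)
stale = stale_stock(all_titles, rental_log, today=date(2024, 6, 1))
print(best, stale)
```

The Top Ten customers report follows the same pattern, counting the customer field instead of the title.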
The video store will have employees, customers, stock and suppliers. Employees, customers and suppliers related to the video store can be created, deleted or updated. Creating / updating / deleting a customer profile (video store) will be very similar to creating / updating / deleting a customer’s account in the banking industry. The stock status also needs to be up to date (available, rented, late or damaged).

An ATM will be inside the video store. The ATM is available to both the public and the employees. The ATM can be used for: bank account balance inquiry, money withdrawal, funds transfer and transaction history (last 5 transactions with dates, time, type of transaction and outcome). The ATM should also cancel a transaction request and swallow a debit card when the user has entered a wrong PIN three times in succession. The ATM can only be used by clients who have existing bank accounts and existing (valid) debit cards. Make provision for situations such as expired debit cards, frozen accounts, insufficient funds, daily withdrawal limit exceeded, etc.

The video store works on a cash-only basis. Customers can withdraw money at the ATM if they don't have cash on them. The ATM is also available to members of the public who only want to use the ATM (without having to do business with the video store).

Payment for stock rented: A Point Of Sale screen (electronic cash register screen) needs to be displayed. The product and the quantity thereof need to be entered. You can make use of drop-down boxes if you want to. The system will calculate the total amount due (and the due date back for the products). Enter the cash amount offered by the customer. Calculate the change amount. Update the video store transaction register.

Stock returned: Update the electronic system. Make provision for the condition in which the stock items were returned (in a working state or damaged, on time or late, individually).

• Capture a history record of products rented.
• Know the value of the stock outside the store.
• Capture a history record of products currently late.
• Capture a history record of products damaged.
• Capture a history record of products currently in store.
• Calculate the value of stock in-store.
• Capture a history record of each registered client's rental record.
• Capture a history record of a client's ATM transactions.
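
The ATM rules described in this brief (card swallowed after three wrong PINs in succession, plus checks for expired cards, frozen accounts, insufficient funds and the daily limit) can be sketched as follows. The class names, status strings, and the default daily limit are hypothetical; the brief does not specify them.

```python
from dataclasses import dataclass
from datetime import date

MAX_PIN_ATTEMPTS = 3  # from the brief: three wrong PINs in succession

@dataclass
class Card:
    pin: str
    expires: date
    failed_attempts: int = 0
    swallowed: bool = False

@dataclass
class Account:
    balance: int
    daily_limit: int = 1000  # hypothetical value, not given in the brief
    frozen: bool = False
    withdrawn_today: int = 0

def enter_pin(card, pin, today):
    """Check a PIN entry and retain the card after too many consecutive failures."""
    if card.swallowed:
        return "card retained"
    if today > card.expires:
        return "card expired"
    if pin == card.pin:
        card.failed_attempts = 0  # a correct PIN resets the streak
        return "ok"
    card.failed_attempts += 1
    if card.failed_attempts >= MAX_PIN_ATTEMPTS:
        card.swallowed = True
        return "card retained"
    return "wrong pin"

def withdraw(account, amount):
    """Apply the account checks in order and debit only if all of them pass."""
    if account.frozen:
        return "account frozen"
    if amount > account.balance:
        return "insufficient funds"
    if account.withdrawn_today + amount > account.daily_limit:
        return "daily limit exceeded"
    account.balance -= amount
    account.withdrawn_today += amount
    return "ok"

today = date(2024, 6, 1)
card = Card(pin="1234", expires=date(2030, 1, 1))
pin_results = [enter_pin(card, "0000", today) for _ in range(3)]

account = Account(balance=500, daily_limit=300)
withdraw_results = [withdraw(account, 200), withdraw(account, 200), withdraw(account, 400)]
print(pin_results, withdraw_results)
```

Returning a status string for every outcome also gives the transaction history something to record as the "outcome" of each request.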
hoangsonww / Graph Data Structure
🔍 This repository explores the graph data structure, focusing on its application in analyzing large texts and developing the Word Graph Game. It includes algorithms for text analysis, graph construction, and game logic, offering a comprehensive toolkit for educational and development purposes.
lmassaoy / Got Sentence Analysis
A series of data analyses of the world-famous TV show 'Game of Thrones', specifically of all the sentences spoken during its 8 seasons.
shashankrao / TwitchDataAnalysis
Exploratory Data Analysis and Visualization for stream/game viewership on Twitch
rndmagtanong / Ph Tft
A data analysis and machine learning project based on data for the game Teamfight Tactics
SaltFishGC / SteamGameDataAnalysis
Big data course project: Steam game data analysis, with results presented using Hadoop + Hive + Sqoop + MySQL + Spring Boot + ECharts.
SarahZ22 / Steam Data Analysis
DU Bootcamp Final Project analyzing data from Steam (cloud-based gaming library). The project includes scraping and retrieving data, data analysis, visualization, and the implementation of machine learning.
Shaheer-Imam / PUBG
Analysis of data from one of the most played battle royale games, PlayerUnknown's Battlegrounds (PUBG)
DataForgeOpenAIHub / Steam Sales Analysis
This repository features an ETL pipeline for retrieving, processing, validating, and ingesting game metadata and sales data from SteamSpy and Steam APIs. Data is stored in a MySQL database on Aiven Cloud and visualized using Tableau dashboards for insightful analysis of gaming trends and sales performance.
pnguenda / Pandas Challenge
# Pandas Homework - Pandas, Pandas, Pandas

## Background

The data dive continues! Now, it's time to take what you've learned about Python Pandas and apply it to new situations. For this assignment, you'll need to complete **one of two** (not both) Data Challenges. Once again, which challenge you take on is your choice. Just be sure to give it your all, as the skills you hone will become powerful tools in your data analytics tool belt.

### Before You Begin

1. Create a new repository for this project called `pandas-challenge`. **Do not add this homework to an existing repository**.
2. Clone the new repository to your computer.
3. Inside your local git repository, create a directory for the Pandas Challenge you choose. Use folder names corresponding to the challenges: **HeroesOfPymoli** or **PyCitySchools**.
4. Add your Jupyter notebook to this folder. This will be the main script to run for analysis.
5. Push the above changes to GitHub or GitLab.

## Option 1: Heroes of Pymoli

Congratulations! After a lot of hard work in the data munging mines, you've landed a job as Lead Analyst for an independent gaming company. You've been assigned the task of analyzing the data for their most recent fantasy game Heroes of Pymoli. Like many others in its genre, the game is free-to-play, but players are encouraged to purchase optional items that enhance their playing experience. As a first task, the company would like you to generate a report that breaks down the game's purchasing data into meaningful insights.
Your final report should include each of the following:

### Player Count

* Total Number of Players

### Purchasing Analysis (Total)

* Number of Unique Items
* Average Purchase Price
* Total Number of Purchases
* Total Revenue

### Gender Demographics

* Percentage and Count of Male Players
* Percentage and Count of Female Players
* Percentage and Count of Other / Non-Disclosed

### Purchasing Analysis (Gender)

* Each of the following, broken down by gender:
  * Purchase Count
  * Average Purchase Price
  * Total Purchase Value
  * Average Purchase Total per Person by Gender

### Age Demographics

* Each of the following, broken into 5-year age bins (i.e. <10, 10-14, 15-19, etc.):
  * Purchase Count
  * Average Purchase Price
  * Total Purchase Value
  * Average Purchase Total per Person by Age Group

### Top Spenders

* Identify the top 5 spenders in the game by total purchase value, then list (in a table):
  * SN
  * Purchase Count
  * Average Purchase Price
  * Total Purchase Value

### Most Popular Items

* Identify the 5 most popular items by purchase count, then list (in a table):
  * Item ID
  * Item Name
  * Purchase Count
  * Item Price
  * Total Purchase Value

### Most Profitable Items

* Identify the 5 most profitable items by total purchase value, then list (in a table):
  * Item ID
  * Item Name
  * Purchase Count
  * Item Price
  * Total Purchase Value

As final considerations:

* You must use the Pandas Library and the Jupyter Notebook.
* You must submit a link to your Jupyter Notebook with the viewable Data Frames.
* You must include a written description of three observable trends based on the data.
* See [Example Solution](HeroesOfPymoli/HeroesOfPymoli_starter.ipynb) for a reference on expected format.

## Option 2: PyCitySchools

Well done! Having spent years analyzing financial records for big banks, you've finally scratched your idealistic itch and joined the education sector. In your latest role, you've become the Chief Data Scientist for your city's school district.
In this capacity, you'll be helping the school board and mayor make strategic decisions regarding future school budgets and priorities. As a first task, you've been asked to analyze the district-wide standardized test results. You'll be given access to every student's math and reading scores, as well as various information on the schools they attend. Your responsibility is to aggregate the data and showcase obvious trends in school performance.

Your final report should include each of the following:

### District Summary

* Create a high level snapshot (in table form) of the district's key metrics, including:
  * Total Schools
  * Total Students
  * Total Budget
  * Average Math Score
  * Average Reading Score
  * % Passing Math (The percentage of students that passed math.)
  * % Passing Reading (The percentage of students that passed reading.)
  * % Overall Passing (The percentage of students that passed math **and** reading.)

### School Summary

* Create an overview table that summarizes key metrics about each school, including:
  * School Name
  * School Type
  * Total Students
  * Total School Budget
  * Per Student Budget
  * Average Math Score
  * Average Reading Score
  * % Passing Math (The percentage of students that passed math.)
  * % Passing Reading (The percentage of students that passed reading.)
  * % Overall Passing (The percentage of students that passed math **and** reading.)

### Top Performing Schools (By % Overall Passing)

* Create a table that highlights the top 5 performing schools based on % Overall Passing. Include:
  * School Name
  * School Type
  * Total Students
  * Total School Budget
  * Per Student Budget
  * Average Math Score
  * Average Reading Score
  * % Passing Math (The percentage of students that passed math.)
  * % Passing Reading (The percentage of students that passed reading.)
  * % Overall Passing (The percentage of students that passed math **and** reading.)
### Bottom Performing Schools (By % Overall Passing)

* Create a table that highlights the bottom 5 performing schools based on % Overall Passing. Include all of the same metrics as above.

### Math Scores by Grade

* Create a table that lists the average Math Score for students of each grade level (9th, 10th, 11th, 12th) at each school.

### Reading Scores by Grade

* Create a table that lists the average Reading Score for students of each grade level (9th, 10th, 11th, 12th) at each school.

### Scores by School Spending

* Create a table that breaks down school performances based on average Spending Ranges (Per Student). Use 4 reasonable bins to group school spending. Include in the table each of the following:
  * Average Math Score
  * Average Reading Score
  * % Passing Math (The percentage of students that passed math.)
  * % Passing Reading (The percentage of students that passed reading.)
  * % Overall Passing (The percentage of students that passed math **and** reading.)

### Scores by School Size

* Repeat the above breakdown, but this time group schools based on a reasonable approximation of school size (Small, Medium, Large).

### Scores by School Type

* Repeat the above breakdown, but this time group schools based on school type (Charter vs. District).

As final considerations:

* Use the pandas library and Jupyter Notebook.
* You must submit a link to your Jupyter Notebook with the viewable Data Frames.
* You must include a written description of at least two observable trends based on the data.
* See [Example Solution](PyCitySchools/PyCitySchools_starter.ipynb) for a reference on the expected format.

## Hints and Considerations

* These are challenging activities for a number of reasons. For one, these activities will require you to analyze thousands of records. Hacking through the data to look for obvious trends in Excel is just not a feasible option. The size of the data may seem daunting, but pandas will allow you to efficiently parse through it.
* Second, these activities will also challenge you by requiring you to learn on your feet. Don't fool yourself into thinking: "I need to study pandas more closely before diving in." Get the basic gist of the library and then _immediately_ get to work. When facing a daunting task, it's easy to think: "I'm just not ready to tackle it yet." But that's the surest way to never succeed. Learning to program requires one to constantly tinker, experiment, and learn on the fly. You are doing exactly the _right_ thing, if you find yourself constantly practicing Google-Fu and diving into documentation. There is just no way (or reason) to try and memorize it all. Online references are available for you to use when you need them. So use them!
* Take each of these tasks one at a time. Begin your work, answering the basic questions: "How do I import the data?" "How do I convert the data into a DataFrame?" "How do I build the first table?" Don't get intimidated by the number of asks. Many of them are repetitive in nature with just a few tweaks. Be persistent and creative!
* Expect these exercises to take time! Don't get discouraged if you find yourself spending hours initially with little progress. Force yourself to deal with the discomfort of not knowing and forge ahead. Consider these hours an investment in your future!
* As always, feel encouraged to work in groups and get help from your TAs and Instructor. Just remember, true success comes from mastery and _not_ a completed homework assignment. So challenge yourself to truly succeed!

### Copyright

Trilogy Education Services © 2019. All Rights Reserved.
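
To illustrate the binning logic behind the Age Demographics table in Option 1, here is a plain-Python sketch using the age groups listed in the brief (<10, 10-14, 15-19, etc.). The assignment itself requires pandas, where `pd.cut` does this kind of binning; this version only shows the arithmetic. The upper `40+` group, the function names, and the sample purchases are assumptions made for illustration.

```python
from bisect import bisect_right

# Lower edges of each group after "<10"; the final "40+" group is an assumption.
EDGES = [10, 15, 20, 25, 30, 35, 40]
LABELS = ["<10", "10-14", "15-19", "20-24", "25-29", "30-34", "35-39", "40+"]

def age_group(age):
    """Map an age to its group label; each edge belongs to the group it opens."""
    return LABELS[bisect_right(EDGES, age)]

def summarize_by_age(purchases):
    """purchases: iterable of (age, price). Returns count, total and average per group."""
    summary = {}
    for age, price in purchases:
        stats = summary.setdefault(age_group(age), {"count": 0, "total": 0.0})
        stats["count"] += 1
        stats["total"] += price
    for stats in summary.values():
        stats["average"] = stats["total"] / stats["count"]
    return summary

# Hypothetical purchases: (player age, item price).
purchases = [(9, 2.5), (12, 1.0), (14, 3.0), (22, 4.0)]
summary = summarize_by_age(purchases)
print(summary)
```

Note that this counts purchases, not players; the "Average Purchase Total per Person" column additionally needs the number of unique players in each group.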
talkpython / Polars For Power Users Course
Polars for Power Users: Transform Your Data Analysis Game Course
gilha / Nba 2k23 Etl And Data Analysis
ETL + Exploratory Data Analysis of NBA 2K23 game data
MuxaJlbl4 / Game Data Mining
Personal collection of interesting info, extracted from games by reverse engineering and data mining
tiwariar7 / PUBG Analyzer
A dynamic performance analysis tool designed for competitive gamers. By analyzing gameplay data, it calculates key metrics, assesses player performance, and provides actionable recommendations to improve gameplay strategies.
Choeyanggg / Data Analysis Of VideoGame Sales
No description available
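Smartloe / Data Analysis Of Tokyo Olympic Games
This project uses data analysis and mining techniques to explore and clean data such as the Tokyo Olympics medal table and the event records for each medal, analyzes how medals were distributed and how the medal standings changed at the Tokyo 2020 Olympics, and visualizes the results with Pyecharts.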
gryAI / Baccarat Game Data Generator API
A publicly available API designed to generate simulated Baccarat game data. This API is intended for aspiring data engineers, data analysts, and data scientists who are seeking practice with real-time data, statistical analysis, and modeling.