Output live football scores (real time) via Bluetooth

I have a personal project in mind but I don't know where to start.
For example: Barcelona - Real Madrid.
When one of the teams scores a goal, I want it announced through my Bluetooth speaker, for example:
"Barcelona scored a goal. It's 1-0 now."

I have a rough idea. Try using Python to scrape the live score from the internet. Then, whenever the scraped score changes, pass the update to a text-to-speech package and route the audio to your speaker. You can have the speaker announce each team's goals as they happen.
Recap:
Scrape online scores with python, use text to speech on the scraped scores.
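That recap can be sketched as a polling loop. Here `fetch_scores` and `speak` are placeholders: swap in your actual scraper (e.g. requests + BeautifulSoup against a live-score page) and a TTS engine such as pyttsx3, whose audio goes to whatever output device the OS has selected, i.e. the paired Bluetooth speaker. The team names are hard-coded purely for illustration.

```python
import time

def announcement(scorer, home_goals, away_goals):
    """Build the sentence to speak when `scorer` nets a goal."""
    return f"{scorer} scored a goal. It's {home_goals}-{away_goals} now."

def watch(fetch_scores, speak, poll_seconds=30):
    """Poll fetch_scores() -> (home_goals, away_goals); speak on every change.

    With pyttsx3, `speak` would be roughly:
        engine = pyttsx3.init()
        engine.say(text); engine.runAndWait()
    """
    last = fetch_scores()
    while True:
        time.sleep(poll_seconds)
        home, away = fetch_scores()
        if (home, away) != last:
            # Decide which side scored by comparing against the last poll.
            scorer = "Barcelona" if home > last[0] else "Real Madrid"
            speak(announcement(scorer, home, away))
            last = (home, away)
```

One caveat: scraping misses goals scored and disallowed between polls, so keep the polling interval short and compare full score tuples rather than totals.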

Related

Subset Text Similarity Score - How to detect a piece of small text that is very similar to a subset of a much bigger text

There are many ways to detect similarity between two texts, such as the Jaccard index, TF-IDF cosine similarity, or sentence embeddings. However, all of them refer to the use case of two texts that are compared in full.
I don't know the established name for this, so I'll call it a Subset Text Similarity Score: the goal is to detect/calculate a score indicating whether a small text is an extract from a bigger text.
For example, there is a big text (from a news)
Google Stadia, the company's cloud gaming service, will shut down on January 18 after the game failed to gain the traction with users the company had hoped for.
The cloud gaming service debuted through a closed beta in October 2018 and publicly launched in November 2019.
In spite of the fact that users are about to lose access to all of their titles and save on Stadia, many publishers share ways to keep playing their games on other platforms, reports The Verge.
Moreover, Google is also refunding all Stadia hardware purchased through the Google Store as well as all the games and add-on content purchased from the Stadia store.
The objective of the subset text similarity is to detect whether a small text is a subset (extract) of the bigger text above. The sentences of the small text need not be in the same order as in the bigger text.
Example small text
On Stadia, users will lose access to all of their titles and saves. All Stadia hardware purchased through the Google Store will be refunded.
For a small text above, the subset similarity score should be very high.
Is there some package or NLP method that can do this?
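One simple baseline, sketched below under the assumption that lexical overlap is enough: split both texts into sentences, score each small-text sentence against every big-text sentence with bag-of-words cosine similarity, take the best match per sentence, and average. This is order-insensitive by construction. For paraphrases (as in the Stadia example, where wording changes), you would swap the token-count vectors for sentence embeddings (e.g. the sentence-transformers library) while keeping the same max-then-average structure.

```python
import math
import re
from collections import Counter

def _sentences(text):
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def _vec(sentence):
    # Bag-of-words token counts; embeddings would slot in here instead.
    return Counter(re.findall(r"[a-z']+", sentence.lower()))

def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def subset_similarity(small_text, big_text):
    """For each sentence of the small text, find its best match among the
    big text's sentences; average those maxima (order-insensitive)."""
    big_vecs = [_vec(s) for s in _sentences(big_text)]
    scores = [max(_cosine(_vec(s), bv) for bv in big_vecs)
              for s in _sentences(small_text)]
    return sum(scores) / len(scores)
```

A verbatim extract scores near 1.0, unrelated text near 0; paraphrased extracts land in between, which is where embeddings help.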

How to calculate a new player's winning probability against an expert-level player

Let's say I have an online game where users can play against each other. I have a list of players, each categorized as expert, average, or below average. A new player wants to play this game, and the system pairs them against an expert-level player. What would be the winning probability of the new player?
ex: expert player data looks as:
total games: 45,
win: 30,
draw: 10,
loss: 5
Is it possible to compute a probability for new player vs. expert player? If yes, what statistics should be considered?
Thanks in advance
I would take the subset of the data corresponding to expert vs. average games.
If the expert in question has enough data (run a binomial test to check whether his win rate is statistically significant), then you can rely on his/her data alone. E.g., if this particular expert has played 100 games against average players and won 80, that's significant, and you can say he/she has an 80% chance of winning.
If the expert doesn't have enough data (i.e., to be significant), you can combine data from ALL expert vs. average games to compensate, although this raises another question: how did this player get awarded the expert rank if we believe there aren't enough games to determine it?
I believe the main issue here is that you reduce all possible skill levels to 3 (expert vs. average vs. below average), whereas in reality there might be more granularity to how good or bad people are, so your model might be oversimplified. Introducing more levels would help address that issue (e.g., 5 levels doesn't seem like too much for people to grasp and might already perform better than 3). Alternatively, you can also try to calculate probabilities based on more detailed properties (e.g., win rate, number of days the user has been playing, age, gender...) even if it's something you only keep to yourself (i.e., players only see the n ranking levels, but you have more detailed properties for your calculations).
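The binomial test mentioned above can be done exactly in a few lines (scipy.stats.binomtest does the same thing if SciPy is available). This sketch also adds a Laplace-smoothed win-rate estimate, a common hedge when the game count is small:

```python
from math import comb

def binom_sf_pvalue(wins, games, p0=0.5):
    """One-sided exact binomial test: P(X >= wins | n=games, p=p0).
    A small p-value means the win rate is significantly above p0."""
    return sum(comb(games, k) * p0**k * (1 - p0)**(games - k)
               for k in range(wins, games + 1))

def smoothed_win_rate(wins, games):
    """Laplace-smoothed estimate: pulls extreme rates toward 0.5
    when there are few games."""
    return (wins + 1) / (games + 2)
```

For the 80-wins-in-100-games expert above, the p-value is far below 0.05, so the 80% estimate is defensible; for, say, 6 wins in 10 games it is not.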

Creating radar image from web api data

To get familiar with front-end web development, I'm creating a weather app. Most of the tutorials I found display the temperature, humidity, chance of rain, etc.
Looking at the Dark Sky API, I see the "Time Machine Request" returns observed weather conditions, and the response contains a 'precipIntensity' field: The intensity (in inches of liquid water per hour) of precipitation occurring at the given time. This value is conditional on probability (that is, assuming any precipitation occurs at all).
So it made me wonder: could I create a 'radar image' of precipitation intensity?
Assuming other weather apis are similar, is generating a radar image of precipitation as straightforward as:
Create a grid of latitude/longitude coordinates.
Submit a request for weather data for each coordinate.
Build a color-coded grid of received precipitation intensity values and smooth between them.
Or would that be considered a misuse of the data?
Thanks,
Mike
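The three steps in the question can be sketched like this. `fetch_precip(lat, lon)` stands in for the per-point weather API call (Dark Sky's Time Machine endpoint, or any similar API); the color thresholds are illustrative, not official NWS values.

```python
def make_grid(lat_min, lat_max, lon_min, lon_max, steps):
    """Step 1: an evenly spaced steps x steps grid of (lat, lon) points."""
    step_lat = (lat_max - lat_min) / (steps - 1)
    step_lon = (lon_max - lon_min) / (steps - 1)
    return [[(lat_min + i * step_lat, lon_min + j * step_lon)
             for j in range(steps)] for i in range(steps)]

def intensity_color(inches_per_hour):
    """Step 3: bucket precip intensity into a display color (thresholds
    invented for illustration)."""
    if inches_per_hour < 0.01:
        return "none"
    if inches_per_hour < 0.10:
        return "light-green"
    if inches_per_hour < 0.30:
        return "yellow"
    return "red"

def radar_image(grid, fetch_precip):
    """Step 2 + 3: query each point and color-code the result."""
    return [[intensity_color(fetch_precip(lat, lon)) for (lat, lon) in row]
            for row in grid]
```

Note the cost: a 100 x 100 grid is 10,000 API requests per frame, which is another practical argument for the gridded MRMS data discussed in the answer below.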
This would most likely end up in a very low resolution product. I will explain.
Weather observations come in from sources ranging from mesonet stations and airports to programs like the Citizen Weather Observer Program. These thousands of inputs feed into the NOAA MADIS system, a centralized server that stores all observations. The companies that provide the APIs pull their data from MADIS.
The problem with the observed conditions is twofold. First, the stations are highly clustered in urban areas. In Texas, for example, there are hundreds of stations in central Texas near San Antonio and Austin, but 100 miles west there is essentially nothing. Generating a radar image this way would require extreme interpolation.
The second problem is observation time. Input from rain gauges is often delayed by several minutes to an hour or more, which would give inaccurate data.
If you wanted a gridded system, the best answer would be to use MRMS (multi-radar-multi-sensor) data from the NWS. It is not an API. These are .grib files that must be downloaded and processed. This is the live viewer and if you want to work on the data itself you can use the NOAA Weather Climate Toolkit to view and/or process by GUI or batch process (You can export to geoTIF and colorize it with GDAL tools). The actual MRMS data is located here and for the basic usage you are looking for, you could use the latest data in the "MergedReflectivityComposite" folder. (That would be how other radar apps show rain.) If you want actual precip intensity, check the "PrecipRate" folder.
For anything else except radar (warning polygons, etc) the NWS has an API that is located here.
If you have other questions, I will be happy to help.

Comparing voice input with existing audio sources

I'm currently working on a script that would compare audio input with existing audio sources and return a match, if any.
The idea is that the voice input would not be convertible to text: the inputs would be vocalizations such as a dog's "woof" or a cat's "meow".
In the end, I would like the script to conclude whether the input was a cat or dog sound, or none of the two.
I understand that it would require preprocessing the sound input (low-pass filtering, noise reduction, etc.), then doing a spectrum analysis of the sound before comparing it to the existing spectrum analyses in the DB, but I don't know where to start.
Are there any libraries for this kind of small project that could help?
How do I compare spectrum analysis?
How does spectrum analysis comparison take into account that two different people could make the same meow sound? Does it count a match up to a specific percentage?
Thanks for any guidance regarding this matter.

Text classification

I have a trivial understanding of NLP so please keep things basic.
I would like to run some PDFs at work through a keyword extractor/classifier and build a taxonomy - in the hope of delivering some business intelligence.
For example, given a few thousand PDFs to mine, I would like to determine the markets they apply to (we serve about 5 major industries, each having several minor industries). Each industry and sub-industry has a specific market, and in most cases those deal with OEMs, which in turn deal with models, which further subdivide into component parts, etc.
I would love to crunch these PDFs into a semi-structured (more a graph actually) output like:
Aerospace
Manufacturing
Repair
PT Support
M250
C20
C18
Distribution
Can text classifiers do that? Is this too specific? How do you train a system like this to know that C18 is a "model" from the "manufacturer" Rolls-Royce in the M250 series, and that "PT Support" is a sub-component?
I could build this data manually, but it would take forever...
Is there a way I could use a text classifier framework to build something more efficiently than with regex and Python?
Just looking for ideas at this point... Watched a few tutorials on R and python libs but they didn't sound quite like what I am looking for.
OK, let's break your problem into smaller sub-problems. I would break the task down as:
Read the PDFs and extract data and metadata from them - take a look at the Apache Tika library.
Any classifier needs training data to be effective - create training data for the text classifier.
Then apply any suitable classification algorithm.
You can also have a look at the Carrot2 clustering engine; it will automatically analyse the data and group the PDFs into different categories.
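Before investing in a trained classifier, a rule-based tagger over the taxonomy can serve as both a baseline and a way to bootstrap training data. The keyword map below is invented for illustration (only the terms from the question are real); in practice you would mine it from labelled documents, then graduate to a trained model (e.g. TF-IDF features plus a linear classifier) once enough labelled examples exist.

```python
# Hypothetical taxonomy paths -> required keywords (all must appear).
# Terms like M250, C18, C20, "PT Support" come from the question;
# the structure and extra keywords are illustrative assumptions.
TAXONOMY = {
    "Aerospace/Manufacturing/M250/C20": ["m250", "c20"],
    "Aerospace/Manufacturing/M250/C18": ["m250", "c18"],
    "Aerospace/Repair/PT Support": ["pt support"],
    "Aerospace/Distribution": ["distribution"],
}

def tag_document(text):
    """Return every taxonomy path whose keywords all appear in the text.
    A document can legitimately match several paths (it's a graph)."""
    lowered = text.lower()
    return [path for path, keywords in TAXONOMY.items()
            if all(kw in lowered for kw in keywords)]
```

Running this over the PDF text extracted by Tika gives a first-pass labelling; the documents it tags confidently become training data, and the ones it misses show where keywords alone aren't enough.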
