Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 2 years ago.
I need to categorize domains into different categories that offer the best use of a domain name.
Like categorizing 'gamez.com' as a gaming portal.
Is there any service that classifies domain names the way Sedo does?
All the systems that I am aware of manage a list, somewhat by hand.
Taking web-filtering proxies (e.g. WebSense) as inspiration, you could scan for keywords in the domain name itself, or in the web content and meta tags served at that location. However, there are always items that match more than one category, or no category at all, and these need deeper analysis.
Eventually you end up building your own fairly complex logic, maintaining a list by hand, or buying a list from someone else.
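A minimal sketch of the keyword-scanning idea, looking only at the domain name and not at page content. The category names and keyword lists are made up for illustration, and `classify_domain` is a hypothetical helper, not part of any filtering product:

```python
from collections import Counter

# Hypothetical keyword lists; a real system would need a much larger, curated set.
CATEGORY_KEYWORDS = {
    "gaming": ["game", "play", "arcade"],
    "travel": ["travel", "trip", "hotel", "flight"],
    "finance": ["bank", "loan", "invest"],
}


def classify_domain(domain):
    """Return every category whose keywords appear in the domain's name part."""
    # Drop the TLD and lowercase what is left: "Gamez.com" -> "gamez".
    name = domain.lower().rsplit(".", 1)[0]
    hits = Counter()
    for category, keywords in CATEGORY_KEYWORDS.items():
        for keyword in keywords:
            if keyword in name:
                hits[category] += 1
    # Categories ordered by number of matching keywords, best first.
    return [category for category, _ in hits.most_common()]
```

A name like 'playtrip.com' matches two categories and 'example.org' matches none, which is exactly the ambiguity described above.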
SimilarWeb API does that.
It's really straightforward: given a URL, it returns that domain's category.
If these are new or unused domains, there isn't any information about them on the internet yet. In that case you could use a crowdsourcing service such as Mechanical Turk: https://www.mturk.com/ .
You could post a task with your list of domains and the possible categories. The downside is that this will cost you money.
If the domains are already in use, you can use a bookmarking service such as Xmarks or Delicious. Retrieve all public bookmarks for the domain and count how often each tag occurs. The most-used tags will indicate the domain's category.
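The tag-counting step could look roughly like this. It assumes the bookmarks have already been retrieved as lists of tags; the sample data is invented and `top_tags` is a hypothetical helper, since how you fetch bookmarks depends on the service:

```python
from collections import Counter


def top_tags(bookmarks, n=3):
    """Count tags across bookmarks and return the n most frequent ones."""
    counts = Counter()
    for tags in bookmarks:
        # Normalise case so "Games" and "games" count as the same tag.
        counts.update(tag.lower() for tag in tags)
    return counts.most_common(n)


# Made-up sample data: each inner list holds the tags of one public bookmark.
sample = [
    ["games", "flash", "fun"],
    ["Games", "arcade"],
    ["games", "fun"],
]
print(top_tags(sample, n=2))
```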
I think https://tools.zvelo.com/ has pretty accurate categorization.
For example gamez.com comes back with Hobbies and Interests as IAB-TIER-1 and Video & Computer Games as IAB-TIER-2.
It also reports whether the domain is brand-safe and whether it hosts malicious or illegal content.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm pretty new to search engines and a complete beginner at machine learning, but I wanted to know whether there is a way to combine the functionality of a search engine like Elasticsearch or Apache Solr with a machine learning project like Apache Mahout, H2O or PredictionIO.
For example, say you work on a travel website where you can search for a destination. You start typing "au", so the first suggestions are "AUstria", "AUstralia", "mAUrice island", "mAUritania", etc. This is typically what Elasticsearch can do.
But you know that this user has already travelled to Mauritania three times, so you want Mauritania to come first in the suggestions. And I guess that's typically what machine learning can do.
Are there bridges between these two types of technology? Can machine learning work efficiently alongside a search engine?
I'm open to all answers, regardless of the technologies used. If you have ever experienced this type of problems, my ears are wide open :-)
Thank you
Your question is very general in nature, so my answer will have to be the same.
Consider a recommender framework such as the correlated co-occurrence recommender in Apache Mahout. Unlike the vanilla Spark recommender, this implementation allows for multiple types of input, such as having viewed a web site, having booked a trip there before, demographic information, etc.
You would then calculate the recommendations for each user at whatever interval suits you. The recommendations are based on multiple criteria and on what other people similar to this user have done. Consider your 'items' in this case to be every destination in the world. So we now have every possible destination ranked for each user.
It is then a trivial extension to index Elasticsearch by user, storing the ordered list of that user's recommended destinations.
For example, we have a user who has visited Berlin, looked at several hotels in Vienna, and is from Romania. When the user types in "au", we would expect to see "Austria" come up in the results much higher than "Australia".
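To make the last step concrete, here is a rough sketch of re-ordering suggestions by a per-user ranking. It is plain Python standing in for what the search engine would do, not Elasticsearch or Mahout code; `suggest`, the destination list and the ranking are all invented for illustration:

```python
def suggest(query, destinations, user_ranking):
    """Return destinations containing `query`, ordered by the user's ranking."""
    q = query.lower()
    matches = [d for d in destinations if q in d.lower()]
    # Position in the user's precomputed recommendation list; unranked items go last.
    position = {name: i for i, name in enumerate(user_ranking)}
    return sorted(matches, key=lambda d: (position.get(d, len(position)), d))


destinations = ["Australia", "Austria", "Mauritania", "Mauritius"]
# Hypothetical output of the recommender for one user, best first.
user_ranking = ["Austria", "Germany", "Mauritania"]
print(suggest("au", destinations, user_ranking))
```

Destinations the recommender has not ranked for this user fall back to alphabetical order at the end of the list.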
FYI, per the comments and downvotes, you probably should have either A) asked a more specific programming question or B) asked this question on another forum such as Data Science Stack Exchange.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
I want to build ... something (website? app? tool of some variety?) that searches other sites -- such as Amazon -- for specific items and then lists whether or not those items exist. Ideally it could also pull prices, but that's secondary.
I'd like to be able to enter a (very specific, an identification number) search term into the thing that I build and then have the thing return whether or not the searched item exists on the sites that it checks (a predetermined list). I'd also like it to take a list of ID numbers and search them all at once.
I have no idea where to begin. Can anyone point me in the right direction? What do I need to learn to make this happen?
You will need to learn a few key languages in order to start working on a program like this.
PHP: you need a server-side language to fetch and scan the sites
JavaScript: for the input on the user's side
HTML: for the page that hosts the JavaScript
Once you learn the basics, search Stack Overflow for questions relating to each specific problem you run into.
This question is certainly too broad, but since the OP asks to be pointed in some direction, here are a few suggestions:
This seems to be a big project. You'll need to find out whether the sites you want to fetch product info from offer an official API. If they do, use the API to retrieve the product info; otherwise use web scraping, where you retrieve the data by parsing the page and storing it in your local database.
Amazon provides EC2 instances, where you can rent a server by the hour in whatever configuration you need, e.g. Linux with Apache/MySQL/PHP, or Linux with Java.
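As a rough sketch of the overall shape, the batch check can be written so that each site is just a function that says whether an ID exists there. The lookups below are in-memory stand-ins with invented IDs; real ones would call an official API or parse a fetched page, and `check_items` is a hypothetical helper:

```python
def check_items(item_ids, sites):
    """Map each item ID to {site name: True/False/None}.

    `sites` maps a site name to a callable that takes an ID and returns
    True if the item exists there. None marks a lookup that failed.
    """
    results = {}
    for item_id in item_ids:
        row = {}
        for site_name, lookup in sites.items():
            try:
                row[site_name] = bool(lookup(item_id))
            except Exception:
                # A network error or a changed page layout should not abort the batch.
                row[site_name] = None
        results[item_id] = row
    return results


# Stand-in lookups backed by in-memory sets (invented IDs).
shop_a = {"978-0131103627", "978-0201633610"}
shop_b = {"978-0131103627"}


def broken_lookup(item_id):
    raise RuntimeError("site unreachable")


sites = {
    "shop_a": lambda i: i in shop_a,
    "shop_b": lambda i: i in shop_b,
    "shop_c": broken_lookup,
}
report = check_items(["978-0131103627", "000-0000000000"], sites)
```

Keeping the per-site logic behind a plain function means an API-based lookup and a scraping-based lookup can be swapped without touching the batch loop.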
Amazon has a set of other tools like the S3 storage where you can host your images/docs/video and link them on the site.
Hope this helps in some way.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 9 years ago.
Sometimes when I search for something on Google, it shows results (website links), but under a result it also shows some important links within that website.
Is this a feature of the website, or does Google use something to find the main links of the website? Is it related to search engine optimization?
You probably mean Google’s sitelinks.
We only show sitelinks for results when we think they'll be useful to the user. If the structure of your site doesn't allow our algorithms to find good sitelinks, or we don't think that the sitelinks for your site are relevant for the user's query, we won't show them.
(See this [closed] question.)
It has to do with click-through rates of those links. For example, Googling 'Amazon' brings up amazon.com, with a handful of links below: Books, Kindle e-Books, Music, etc.
These are obviously popular categories on Amazon. Google tracks where users click, then uses that data to make SERPs more relevant.
Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
I'm writing a small helper utility for obscure software that is used at a local shop. Basically, I would like to know if anyone searches for anything associated with that software and if publishing my work on the Internet would make any sense. I entered the name of the software into Google Trends, but my terms "do not have enough search volume to show graphs" despite the fact that Google lists 250,000 results for the software name, or 35,000 if I explicitly remove terms such as serial and warez from the search.
Does anyone know of alternatives to Google Trends? Or of another way to find out if people search for a particular keyword?
I found what I was looking for.
Google AdWords Keyword Tool
Yahoo Clues is a service similar to Google Trends. But I don't think it's as effective for any category that is non-entertainment.
If you don't get an answer here, another place to ask might be The Business of Software.
Google Trends was also telling me there wasn't enough data for my query. I found Google Insights to do the job nicely. And unlike the AdWords tool mentioned in the author's answer, it actually shows a trend.
Here's an example which shows the emergence of 3 terms with too low a volume to show up on Trends: #bigdata, #datascientist & #datajournalism.
Here's a related SO question.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 months ago.
Where can I find a database of phone number masks for mobile operators, or just a web site where I can detect the mobile operator from a phone number?
I know this is out of date, but you would need to do something called an HLR lookup via an SMS gateway, InfoBip for example.
In the UK, you cannot do this. Numbers can be ported from operator to operator, it's all very fluid. Each operator will know how to route these numbers between themselves, but they don't expose that routing to outside parties.
Not in Australia - mobile numbers might be handed to operators in blocks, but they belong to the user and can be ported to any carrier the user chooses to use.
Of course, there are still ways to look up an individual number and find out which carrier it's on - there have to be, in order for the call to be routed to the appropriate carrier. You're not going to get access to that without investing a significant amount of money to set up a telco though.
All of this is almost certainly irrelevant to you, as you didn't say you were specifically interested in Australia; but then again, you didn't say you weren't interested in Australia either.
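For what it's worth, the block allocation mentioned in this answer can be checked with a simple longest-prefix match, sketched below. The prefixes and operator names are invented, and `original_operator` is a hypothetical helper. Because of porting, this only tells you which operator a number was originally handed to, not which carrier it is on now:

```python
# Hypothetical prefix blocks; real allocations come from a national regulator.
PREFIX_BLOCKS = {
    "+61400": "Operator A",
    "+6141": "Operator B",
    "+61": "Unknown Australian operator",
}


def original_operator(number):
    """Return the operator whose block has the longest prefix matching `number`."""
    # Keep digits only, then restore the leading "+" of the international format.
    digits = "+" + "".join(ch for ch in number if ch.isdigit())
    # Try the longest prefixes first so "+61400" wins over "+61".
    for prefix in sorted(PREFIX_BLOCKS, key=len, reverse=True):
        if digits.startswith(prefix):
            return PREFIX_BLOCKS[prefix]
    return None
```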
I found such a service which seems to provide an API for lookup of the operator in many countries around the world: http://numberportabilitylookup.com/