Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 4 years ago.
Take a site like https://www.wood-database.com/wood-finder/ (our working example). Each of its pages has data on one wood species. If we need to sort the woods by a ratio of two data fields, for example hardness/weight, the site's own tools aren't much help.
What would be useful, though, is getting that data into an Excel sheet, which could trivially calculate the ratio and sort by it.
What ways are there to fill that sheet out automatically? What tools besides Excel could do it?
You should have a look at Python; it's a perfect fit for the job. You could use the requests library together with BeautifulSoup to begin with, then load all the data into a pandas DataFrame and simply export it to Excel (standard functionality of pandas).
If you really want to scrape the site thoroughly, you could consider using Scrapy (https://scrapy.org/).
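A rough sketch of the parse, compute, sort, and export steps, under some loud assumptions: it uses only the standard library (html.parser and csv stand in for BeautifulSoup and pandas so it runs without installs), the table layout in SAMPLE_HTML is hypothetical and not the real structure of any site, and the species names and numbers are placeholders. In practice the HTML string would come from a fetched page rather than a constant.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page fragment: placeholder species and made-up numbers,
# not the real layout or data of any site.
SAMPLE_HTML = """
<table>
  <tr><th>Species</th><th>Hardness</th><th>Weight</th></tr>
  <tr><td>Wood A</td><td>1000</td><td>500</td></tr>
  <tr><td>Wood B</td><td>900</td><td>300</td></tr>
  <tr><td>Wood C</td><td>400</td><td>400</td></tr>
</table>
"""


class TableRowParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            self._in_cell = False
            if self._row:  # header rows use <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


def rank_by_ratio(html):
    """Return (name, hardness, weight, ratio) tuples, highest ratio first."""
    parser = TableRowParser()
    parser.feed(html)
    ranked = []
    for name, hardness, weight in parser.rows:
        h, w = float(hardness), float(weight)
        ranked.append((name, h, w, h / w))
    ranked.sort(key=lambda row: row[3], reverse=True)
    return ranked


def to_csv(rows):
    """Serialize the ranked rows as CSV text, which Excel can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["species", "hardness", "weight", "ratio"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(rank_by_ratio(SAMPLE_HTML)))
```

With requests, BeautifulSoup, and pandas installed, the same three steps would likely collapse into a fetch, a table lookup, and a DataFrame sort followed by an Excel export.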
Closed 4 days ago.
Is there any open source project or freely available source where I can query a webpage's category (like https://www.trustedsource.org/en/feedback/url)? I have more than 200K webpages in my dataset.
To me it looks more like a classification problem, which is well suited to machine learning. You can build your own model in a popular ML framework (such as Keras/TensorFlow or PyTorch), or search for an existing one on the internet and use your dataset for transfer learning.
I could find a project on GitHub (link) that can be a good starting point.
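To make the "classification problem" framing concrete, here is a very small baseline sketch. It is not the Keras/TensorFlow/PyTorch or transfer-learning approach mentioned above; it swaps in a plain multinomial naive Bayes classifier over word counts, written with the standard library only. The category labels and training texts are made up for illustration, and a real run would use text extracted from the 200K pages.

```python
import math
from collections import Counter, defaultdict

# Made-up training examples: (page text, category label).
TRAINING_PAGES = [
    ("football match goal team score", "sports"),
    ("tennis match player score", "sports"),
    ("stock market shares price", "finance"),
    ("bank interest market price", "finance"),
]


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of pages
        self.vocabulary = set()

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        # Words never seen in training carry no information, so skip them.
        words = [w for w in text.lower().split() if w in self.vocabulary]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


classifier = NaiveBayesClassifier()
for page_text, page_label in TRAINING_PAGES:
    classifier.train(page_text, page_label)

if __name__ == "__main__":
    print(classifier.classify("match score team"))
```

A baseline like this is mainly useful as a sanity check before investing in a neural model.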
Hi today and happy weekend!
It would be interesting to know whether a site uses category pages, since Google shows multiple results from one domain when it has category pages.
Examples:
danlok(com)
best example to see: bloomberg....
Closed 4 years ago.
Do you know of a tool that can compare two Excel files (file1.xlsm vs. file2.xlsm) and show the differences in their Visual Basic code?
I know that there is a tool called Beyond Compare. It lets you compare the contents of the sheets inside the two Excel files, but I can't find a tool that compares the contents of the Visual Basic code.
Thanks
Neo
Ok I figured it out. Beyond Compare actually has downloadable plug-ins to let you compare VBA between 2 excel files. :)
http://www.scootersoftware.com/support.php?zz=kb_moreformats_alt
If you don't have a copy of Beyond Compare handy, try VbaDiff. I built it for the purpose of comparing VBA code as you describe.
www.technicana.com/products/vbadiff.html
Closed 4 years ago.
I am doing a project in news classification. Basically, the system will classify news articles by pre-defined topic (e.g. sports, politics, international). To build the system, I need free data sets for training it.
So far, after a few hours of googling and following links from here, the only suitable data set I could find is this. While this will hopefully be enough, I will try to find more.
Note that the data sets I want should:
Contain full news articles, not just titles
Be in English
Be in .txt format, not XML or a database
Can anybody help me?
Have you tried Reuters21578? It is the most common dataset for text classification. It is formatted in SGML, but it is quite simple to parse and transform into a txt format.
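As a hedged illustration of how simple that transformation can be, the sketch below pulls the topics, title, and body out of SGML records with regular expressions. The tag names (REUTERS, TOPICS, D, TITLE, BODY) follow my recollection of the Reuters-21578 layout and should be checked against the README shipped with the dataset; the sample record is made up and is not taken from the corpus.

```python
import html
import re

# Made-up records in the assumed layout; not real corpus data.
SAMPLE_SGML = """<REUTERS TOPICS="YES" NEWID="1">
<TOPICS><D>cocoa</D></TOPICS>
<TEXT>
<TITLE>EXAMPLE HEADLINE</TITLE>
<BODY>First line of a made-up article.
Second line with an escaped &lt;tag&gt;.
</BODY></TEXT>
</REUTERS>
<REUTERS TOPICS="NO" NEWID="2">
<TOPICS></TOPICS>
<TEXT TYPE="BRIEF">
<TITLE>HEADLINE WITHOUT BODY</TITLE>
</TEXT>
</REUTERS>"""

RECORD_RE = re.compile(r"<REUTERS[^>]*>(.*?)</REUTERS>", re.DOTALL)
TOPICS_RE = re.compile(r"<TOPICS>(.*?)</TOPICS>", re.DOTALL)
TOPIC_ITEM_RE = re.compile(r"<D>(.*?)</D>")
TITLE_RE = re.compile(r"<TITLE>(.*?)</TITLE>", re.DOTALL)
BODY_RE = re.compile(r"<BODY>(.*?)</BODY>", re.DOTALL)


def parse_articles(sgml):
    """Return a list of dicts with 'topics', 'title', and 'body' keys.

    Records without a body are skipped, since the goal is full articles.
    """
    articles = []
    for record_match in RECORD_RE.finditer(sgml):
        record = record_match.group(1)
        body_match = BODY_RE.search(record)
        if body_match is None:
            continue
        topics_match = TOPICS_RE.search(record)
        title_match = TITLE_RE.search(record)
        articles.append({
            "topics": TOPIC_ITEM_RE.findall(topics_match.group(1))
            if topics_match else [],
            "title": html.unescape(title_match.group(1).strip())
            if title_match else "",
            "body": html.unescape(body_match.group(1).strip()),
        })
    return articles


def to_txt(article):
    """Render one article as plain text: title, blank line, body."""
    return f"{article['title']}\n\n{article['body']}\n"


if __name__ == "__main__":
    for article in parse_articles(SAMPLE_SGML):
        print(article["topics"])
        print(to_txt(article))
```

Each article could then be written to its own .txt file, with the topic list kept as the training label.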
You can build it yourself: write a Python/Perl/PHP script that runs a search and, when it finds results, isolates the attributes with regex... I think that is the best option. It is not easy, but it should be fun, and in the end you can share the dataset with us.
Closed 7 years ago.
I have a huge graph structure, like Wikipedia's. Because the graph is very sparse, I store it as an adjacency list. Is there open source software to visualize the graph and zoom in and out?
Gephi would be a good solution. It offers many display styles and layouts as well as using OpenGL to render the graph which makes it fast. It can also read a variety of formats and is extensible.
GraphViz is a very well known OSS graphing tool. If you can print your graph in one of its supported formats you should be ready to go.
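For instance, an adjacency list can be printed in GraphViz's DOT format with a few lines of Python. This is a minimal sketch assuming the graph is directed and held in memory as a dict mapping each node name to a list of neighbours; the function name and sample graph are illustrative only.

```python
def quote(name):
    """Wrap a node name in double quotes, escaping embedded quotes."""
    return '"' + str(name).replace('"', '\\"') + '"'


def adjacency_to_dot(adjacency, graph_name="G"):
    """Render a directed graph, given as {node: [neighbours]}, as DOT text."""
    lines = [f"digraph {graph_name} {{"]
    for node in sorted(adjacency):
        # Declare the node so isolated nodes still appear in the drawing.
        lines.append(f"    {quote(node)};")
        for neighbour in sorted(adjacency[node]):
            lines.append(f"    {quote(node)} -> {quote(neighbour)};")
    lines.append("}")
    return "\n".join(lines)


# Illustrative graph, not real data.
SAMPLE_GRAPH = {"A": ["B", "C"], "B": ["C"], "C": []}

if __name__ == "__main__":
    print(adjacency_to_dot(SAMPLE_GRAPH))
```

Saving the output to a .dot file lets it be passed to GraphViz's layout tools. For a truly huge graph, writing line by line to a file instead of building one string would keep memory use down.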
Closed 4 years ago.
Do any open source libraries exist for programmatically selecting and rating the compatibility of sets of colors using color theory?
It would be very useful to be able to select color palettes based on simple color harmony rules like complementary, analogous, triadic, and tetradic colors.
I just found this: Harmonies theory and math.
Also of interest is the rest of the EasyRGB site, which will explain how to do RGB to HSV, etc.
While it's not source code, it gives the formulas for calculating the values.
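As a rough illustration of how such harmony formulas translate to code, the sketch below rotates a color's hue in HSV space using Python's standard colorsys module. The angle offsets are assumptions based on common conventions (analogous at plus or minus 30 degrees, tetradic as the "square" variant at 90-degree steps) and are not taken from the linked pages, which may define them differently.

```python
import colorsys

# Hue offsets in degrees for each rule. These are assumed conventions;
# other sources use different angles, especially for analogous/tetradic.
HARMONY_OFFSETS = {
    "complementary": (180,),
    "analogous": (-30, 30),
    "triadic": (120, 240),
    "tetradic": (90, 180, 270),
}


def rotate_hue(rgb, degrees):
    """Rotate an (r, g, b) color with 0-255 channels around the hue wheel."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h = (h + degrees / 360.0) % 1.0
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))


def harmony(rgb, rule):
    """Return the companion colors of rgb under the named harmony rule."""
    return [rotate_hue(rgb, offset) for offset in HARMONY_OFFSETS[rule]]


if __name__ == "__main__":
    base = (255, 0, 0)  # pure red
    for rule in HARMONY_OFFSETS:
        print(rule, harmony(base, rule))
```

This only generates palettes; rating how compatible an arbitrary set of colors is would need an additional scoring step, for example measuring how far the hues deviate from these offsets.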
Also interesting: "Color Jack"
This isn't a direct answer and it's not open source, but you might take a look at what they are doing at Adobe's Kuler web site. They have API Documentation that might be worth a read.