How to name a two word variable - naming

I'm not a native English speaker and I'm wondering how's better to naming a variable with multiple words.
For example, I have two urls. One is for querying and the other is for submitting.
Should them be named as url_query, url_submit? Or querying_url, submitting_url? Or query_url, submit_url? Or things like submission_url?
I prefer url_query and url_submit. They are neat if I have a lot different urls for different purposes. But it sounds strange if I say it out.

This is a highly opinionated subject and it may be best to consult a specific coding style of your choice.
However, in general you want you variable names to be easily understood and readable.
For this reason items such as url_submit and url_query may be better option, as both are fairly understandable.
There are many different naming conventions, but the most important factor here is readability and implementation. Remember that comments will also help the readability as well.

You can name them however they feel right to you. Typically I camelCase them however instead of using underscores.
submitUrls/queryUrls
Try to think globally however as you may have these replicated in other pages and you may be able to build off your existing names like *action/*Thing(ie submit + what to submit "submitUrls")
Also, If something doesn't seem to make a good deal of sense to you when you're naming it. It is wise to add a line of comments in to identify what this function does later down the road if you have a bunch of functions named of actions and items.(Things)

Related

Best practices for creating a customized report based on user form input?

My Question
What are the best practices for creating a customized report based on a user form input? Specifically, how do I create an easy to maintain system which takes user input which is collected in a form and generate multiple paragraphs that explains the results of analysis.
Background
I am working on a very large multiyear project with a startup (who is my client). My job is to program analysis and generate reports to users. The pipeline for data looks like this:
Users enter information into a form -> results are calculated based on user input -> reports are displayed to users that share analysis.
It is really important to my client that some of the analysis results are displayed in paragraphs in a non-formal user friendly tone. The challenge is that the form and analysis are quite complex and will only get more complex over time. An example of the type of template for the paragraphs looks something like this:
resultsParagraphText=`Hi ${userName}. We found that the best ice cream flavour for you is ${bestIceCreamFlavor}. These other flavors ${otherFlavors} might be good for you. Here are the reasons why you might enjoy these flavors: ${reasonsWhyGoodFlavors}.
However we would not recommend these other flavors ${badFlavors}. Here are the reasons you should avoid this bad flavors: ${reasonsWhyBadFlavors}.`
These results paragraphs, of which there of many, have several minor problems which combined are significant:
If there is a bug in the code, minor visual errors would be visible to end users (capitalization errors, missing/extra commas, and so on).
A lot of string comparisons (e.g. if answers.previousFlavors.includes("Vanilla")) are required to generate the results paragraphs. Minor errors in the forms (e.g. vanilla in the form is not capitalized so answers.previousFlavors.includes("Vanilla") returns false even when user enters vanilla.) can cause errors in the results paragraph.
Changes in different parts of the project (form, analysis) directly effect how the results paragraph is made. Bad types, differences in string values, null or undefined values not being caught directly have an impact on how the results paragraph is made.
There are many edge cases (e.g. What if the user has no other suitable good flavors for them? The the sentence These other flavors ${otherFlavors} might be good for you. needs to be excluded).
It is hard to write paragraphs that use templates and have a non-formal tone.
and so on.
I have charts and other types of ways to display results and have explained to the client the challenges of sharing the information in paragraph form.
What I am looking for
I need examples, how tos, best practices on how to build a maintainable system for generating customized paragraphs based on user input. I know how to solve each of the individual issues (as they are fairly simple) but in a large project this will become very hard to maintain.
Notes
I have no clue what tags to use for the post. Feel free to edit/add tags if you know more appropriate ones.
The project is planning to use machine learning in the future other parts of the project. If there is a ML/AI solution that is useful please tell me.
I am working primarily in JavaScript, Python, C, and R, but if there is a library or tool in any other language please tell me. Finding a solution is very important to me and I would be willing to learn a lot find a best solution.
To avoid this question being removed because I have rephrased it to avoid asking for personal opinion, instead asking for existing examples or how tos. I can also imagine that others might find a solution fairly useful. If you can edit it to make the question less subjective please do so.
If you have any questions or need clarification feel free to ask. Any help is appreciated.

Parameters in method

I am having a method in which there are more then 7 parameters ,type of all the parameters are different.
Well my question is it fine or i should replace all parameters with the single class(which will contaion all parameters as a instance variable).
Well my question is it fine or i should replace all parameters with the single class
7 is way too much. Replace with a class. With my VS custom theme and fonts settings Intellisense wouldn't fit on the screen when there is a method with so many parameters :-) I find it more readable and easier to understand when working with classes.
Of course those are just my 2 cents and it's subjective. I've seen people writing methods with many many parameters.
Well, there are places where I'd consider that okay - but they're few and far between. I would generally consider using a "parameter class" instead.
Note that it doesn't have to be an "all or nothing" approach - would it make more sense to encapsulate, say, 4 of the parameters together? Would that allow the new class to be used in other methods?
Other thing to consider is whether the method might be doing too much - does the functionality of the method definitely feel right as a single cohesive unit?
Thare are Microsoft APIs with even more parameters; anyway I agree with #Darin, a class could be a good solution, clear and efficient that can avoid passing parameters in the right order... so for example if you change order for oen without using refactoring you're in a mess...
Consider also if that class can be used in other parts of the program...
I somewhat agree with all given answers. However it might be a good idea to break down the method into smaller parts as Jon is suggesting. Usually when you have a situation like this means that the method is doing too much. Typically methods should have 1 or 2 parameters at most. While you would achieve the same by replacing all your parameters with a class parameter you might just be hiding the bigger issue. Is there any chance you can post the method or describe what it's doing?
As a rule of thumb, if you can't fit the entire method, including declaration onto 1 screen, it's too big. Programmers will normally recommend to break down methods to make them reusable. But, it also makes perfect sense to break them down to increase readability.

I have a list of names, some of them are fake, I need to use NLP and Python 3.1 to keep the real names and throw out the fake names

I have no clue of where to start on this. I've never done any NLP and only programmed in Python 3.1, which I have to use. I'm looking at the site http://www.linkedin.com and I have to gather all of the public profiles and some of them have very fake names, like 'aaaaaa k dudujjek' and I've been told I can use NLP to find the real names, where would I even start?
This is a difficult problem to solve, and one which starts with acquiring valid given name & surname lists.
How large is the set of names that you're evaluating, and where do they come from? These are both important things for you to consider. If you're evaluating a small set of "American" names, your valid name lists will differ greatly from lists of Japanese or Indian names, for instance.
Your idea of scraping LinkedIn is on the right track, but you were right to catch the fake profile/name flaw. A better website would probably be something like IMDB (perhaps scraping names by iterating over different birth years), or Wikipedia's lists of most popular given names and most common surnames.
When it comes down to it, this is a precision vs. recall problem: in order to miss fewer fakes, you're inevitably going to throw out some real names. If you loosen up your restrictions, you'll get more fakes, but you'll also throw out fewer real names.
Several possibilities here, but the most obvious seems to be with HMMs, i.e. Hidden Markov Models. The NLTK kit includes [at least] one module for HMMs, although I must admit I never used it.
Another possible snag is that AFAIK, NTLK is not yet ported to Python 3.0
This said, and while I'm quite keen on using NLP techniques where applicable, I think that a process which would use several paradigms, including some NLP tricks may be a better solution for this particular problem. For example, storing even a reduced dictionary of common family names (and first names) in a traditional database may offer both a more reliable and more computationally efficient way of filtering a significant portion of the input data, leaving precious CPU resources to be spent on less obvious cases.
i am afraid this problem is not solveable if your list is even only minimally ‘open’ — if the names are eg customers from a small traditionally acting population, you might end up with a few hundred names for thousands of people. but generally you can hardly predict what is a real name and what is not, however unusual an arabic, chinese, or bantu name may look in a sample of, say, south english rural neighborhood names. i mean, ‘Ng’ is a common cantonese surname, and ‘O’ is common in korea, so assumptions may fail. there is this place in austria called ‘fucking’, so even looking out for four letter words is no guarantee for success.
what you could do is work through a sufficiently big sample of such names and sort them out manually. then, use all kinds of textprocessing tools and collect metrics. maybe you can derive a certain likelyhood for a name to be recognized as fake, maybe it will not be viable. you will never go beyond likelyhoods here, though.
as an aside, we used to use google maps and the telephone directory for validating customer data years ago. if google maps could find the place, we called the address validated. it is clear that under stricter requirements, true validation must go much further. let’s not forget the validation of such data is much more a social question than a linguistic one.

what is the best UX for non-programmer users? comma-separated tags or space-separated tags?

I'm creating a social site for teachers (non-programmers) on which teachers can add events, links, exercises, tips, lesson plans, books, etc.
Each of these items I want them to be able to add tags to as we do at StackOverflow.
However, because they are non-programming users, I thought that space-separated, nonspace tags and camelCase tags would lead to too much confusion, e.g.:
grammar teachingtips universityOfMinnesota phrasalverbs
and indeed on this similar stackoverflow question most of the answers suggested commas like this:
grammar, teaching tips, university of minnesota, phrasal verbs
but then I just signed up for a delicious.com account (which I don't think has a very programmer-centric audience) and saw that they use spaces as well:
separate tags with spaces: e.g. hotels bargains newyork (not new york)
What has been your experience on this point in terms of the current UX trend for tags? Is the average Internet user accostumed to space-separated tags by now? I have to admit, I have never seen comma-separated tags on any major site I have used. Have you come upon a good way to combine them so it doesn't even matter, e.g.:
grammar book reviews teaching tips
and e.g. have a quick algorithm which checks the number of current tags for:
grammar
grammar book
grammar book reviews
book
book reviews
book reviews teaching
...
I'd go comma separated personally. You'll note that Stackoverflow doesn't but the tags are clearly delineated into their own boxes. Plus hyphens are often used for "spacing". I'd say spaces are more natural to non-programmers than hyphens are however.
Comma separated seems the most natural - it's what English uses to punctuate lists. It also allows you to have spaces in tags if you want. People will try to enter
this, that, the other
and expect it to work.
I can't think of a good reason to use spaces.
Notice that delicious has to give an example to demonstrate how to do it their way. That's not a good sign.
If you do go with commas, take care to see how easy it is for a "space user" to see that they made a mistake, and to fix it.
I would go with comma separated tags, if only to save your users the pain of having to use quotes to indicate a tag has a space in it, ie website "stack overflow" tips, or website, stack overflow, tips. I know which I'd prefer.
Comma-separated is the way to go for your educational audience. It's simply intuitive.
Most teachers should have no trouble understanding a system where tags are comma separated, and there is no need to come up with an awkward workaround for phrases.
It depends a little on how the tags are entered. If the user gets suggestions for tags as they type like SO provides (shades of intellisense), space separated is probably fine. However, if you are going to force the user to enter each tag without a reference list it may be easier to accept case-insensitive comma (or semicolon) delimited tags.
You don't want to check all those possibilities unless you are going to severely limit the number of possible tags - that's an O(n!) algorithm, and you most likely don't want to have that extra load on your server.
Your best bet is probably just to stick with one option - the users will (should!) get used to it fairly quickly. Spaces as separators are probably the most common, so I would go with that, since it is the one the users are most likely to have had prior exposure to.
As long as what the software accepts/demands is clear, I think users will be happy with either. Confusion comes when they don't know whether to use commas, semicolons, spaces or...
If you use a number of e-mail clients you'll know how useful a simple tool-tip reminder of whether it's commas or spaces would be when entering multiple recipients.
When tagging, how you set it up depends on what kinds of things you will tag. Media that is hard to index, like pictures, audio, or video, should encourage many and varied tags, because the tags are how you will search the content.
Easily indexed content (text!) should use a very rigid tagging structure, because you don't need to rely on tags for search indexing. Instead, the purpose of tags is sort the content into well-defined categories. Tags should be more like labels or folders.
I'm gonna take a guess here that this content will be mostly text-based, with the occasional picture or video file thrown in. So you don't want either comma or space separated tag entry, but rather some mechanism that forces users to pick from an existing set of tags.
I would assume space separated tags unless there are one or more commas, in which case you should split on commas instead. In other words, support both but in a limited way. You can probably guess right 90+ percent of the time.

Suggestions wanted: What should I name a class that represents a real-life "event"?

I need to define a class that represents a real-life event -- like a "movie premier", "party", etc. I will be creating a set of database objects to store the data for each event and define relationships with other data we have (e.g. venue, people, etc.).
The word "event" has a different meaning in the context of software, so I'd like to name the class something other than "event".
Any ideas on another name?
This related post asks a similar question, but I think all of the suggestions lack creativity -- basically #event, the case-sensitive name Event or use the old-school naming convention CEvent. Technically good solutions, but they don't help when discussing the data objects with peers (my speech and listening abilities are case-insensitive) and don't convey any information on how the class is not an event in the traditional use of the term.
One option would be CalendarEvent, to make it obvious that this is a real-world event tied to a given date.
Activity come to mind.
How about Happening or Occasion?
Normally I'd recommend function, but it too has specific meanings in the context of software. ;)
Occasion might be a good synonym.
The thesaurus lists the following as synonyms of the word event:
accident, act, action, advent,
adventure, affair, appearance,
business, calamity, case, catastrophe,
celebration, ceremony, chance,
circumstance, coincidence,
conjuncture, crisis, deed,
development, emergency, episode,
experience, exploit, fact, function,
holiday, incident, juncture, marvel,
matter, milestone, miracle,
misfortune, mishap, mistake, occasion,
occurrence, pass, phase, phenomenon,
predicament, proceeding, shift,
situation, story, thing*, tide,
transaction, triumph, turn, wonder
Surely one of them would suffice... if not, you can prepend or wrap the word event to make it a non-keyword. Something like #event or [event] although, I have to say that I don't personally like this practice even though it is syntactically permissable.
You could call it a 'Rendezvous'. You could also just make up a word. If this is a key concept in your domain you could abbreviate one of the other suggested names. Things like:
CalenderEvent becomes Calvent
SocialEvent becomes Socent
RealWorldEvent becomes Revent
HumanActivity becomes HAct
Those quick examples might be terrible examples but they are short, don't collide with language or library names, and will become real meaningful words for you and your coworkers very quickly if you work with them frequently.
Perhaps "Affair" or "Advent" -- you could also check the thesaurus:
http://thesaurus.reference.com/browse/event
Entry or EventEntry are probably what I would go with.
I can appreciate you want to avoid confusion with events in the programming sense, but my take on it is that maybe you should go with the most obvious name; program to your domain, and things stay readable and easier to design and maintain.

Resources