Editing XML ProtectedString CDATA - node.js

Third time posting on Stack Overflow, so I apologize for any mistakes. :)
I am trying to edit the character data of a ProtectedString element:
<ProtectedString name="Source"><![CDATA[print("CHANGEME")]]></ProtectedString>
The console output shows:
<![CDATA[print("CHANGEME")]]>
but the change is not saved to the file.
Code so far:
fs.readFile('./1.rbxlx', 'utf8', function(err, data) {
var result = xmljs.xml2js(data, {
compact: true,
spaces: 2
})
for (var i = 17; i <= 17; i++) {
result.roblox["Item"][0]["Item"][i].Properties.ProtectedString = '<![CDATA[print("CHANGED")]]>'
console.log(result.roblox["Item"][0]["Item"][i].Properties.ProtectedString)
}
fs.writeFile("./1.rbxlx", xmljs.js2xml(result, {
compact: true,
spaces: 4,
fullTagEmptyElement: true
}), function(err, data) {
if (err) console.error(err)
})
})
Thank you in advance :)

CDATA does not really exist.
It's a way to represent a text value during serialization. "Serialization" means "converting a complex data structure to text" - and "XML" is the representation of a tree of nodes. The tree of nodes (a.k.a. the document) is the important thing here, "XML" is nothing more than a container format.
CDATA is not part of the document tree. After you read a file with an XML parser (such as xml2js), CDATA will be gone, just like all the angle brackets and the quotes from the XML are gone - what remains is the value it stood for:
console.log(result.roblox["Item"][0]["Item"][i].Properties.ProtectedString);
This will log the actual text value:
print("CHANGEME")
So you need to update the actual text value:
result.roblox["Item"][0]["Item"][i].Properties.ProtectedString = 'print("CHANGED")';
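One caveat, offered as an assumption rather than something stated in this answer: the xmljs.xml2js(data, {compact: true}) call in the question looks like the xml-js package rather than xml2js. As far as I know, xml-js's compact output keeps an element's CDATA text under a _cdata key next to _attributes, so there the assignment would target that key instead of replacing the whole element. The object below is built by hand to mimic that assumed shape; it is not parsed from a real file.

```javascript
// Hand-built stand-in for what xml-js (compact: true) is assumed to return for
// <ProtectedString name="Source"><![CDATA[print("CHANGEME")]]></ProtectedString>.
const properties = {
  ProtectedString: {
    _attributes: { name: "Source" },
    _cdata: 'print("CHANGEME")',
  },
};

// Update only the text value; the attributes stay untouched.
properties.ProtectedString._cdata = 'print("CHANGED")';

console.log(properties.ProtectedString._cdata);
console.log(properties.ProtectedString._attributes.name);
```

If the parsed object in your project has a different shape, log it once with JSON.stringify(result, null, 2) and adjust the path accordingly.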
When the document gets converted to XML again ("serialized"), there probably won't be a <![CDATA[...]]> wrapper around the value anymore. That's a minor detail that you can safely ignore. CDATA is completely optional and XML does not need it to function.
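To illustrate why the wrapper does not matter, here is a small sketch. The helpers escapeText, wrapCdata, readEscaped and readCdata are made up for this example and are deliberately simplified; they do not come from any XML library. Both serialized forms decode to the same text value.

```javascript
// Hypothetical helpers for illustration only; a real serializer handles more cases.
function escapeText(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}
function wrapCdata(text) {
  // Assumes the text does not contain "]]>", which would end the section early.
  return "<![CDATA[" + text + "]]>";
}
function readEscaped(xmlText) {
  // "&amp;" is decoded last so that "&amp;lt;" does not turn into "<".
  return xmlText.replace(/&lt;/g, "<").replace(/&gt;/g, ">").replace(/&amp;/g, "&");
}
function readCdata(xmlText) {
  return xmlText.slice("<![CDATA[".length, -"]]>".length);
}

const value = 'if a < b then print("CHANGED") end';
const escaped = escapeText(value);
const cdata = wrapCdata(value);

console.log(escaped);
console.log(cdata);
console.log(readEscaped(escaped) === readCdata(cdata));
```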
The component that writes the XML (the serializer) decides whether it wants to use CDATA for a text value or not. In xml2js, this component is the Builder. You can tell the Builder to use CDATA via an option, but your control over where it does it is limited:
cdata (default: false): wrap text nodes in <![CDATA[ ... ]]> instead of escaping when necessary. Does not add <![CDATA[ ... ]]> if it is not required. Added in 0.4.5.
It might even add CDATA where there was none before.
Overall: don't worry about it. The text values will be correct with or without CDATA. Don't build software or processes that depend on CDATA being there; that is always a mistake.

Related

NodeJS why is object[0] returning '{' instead of the first property from this json object?

I have to go through a bunch of code to get some data from an iframe. The iframe has a lot of data, but in there is a property called '_name'. The first key of _name is 'extension_id' and its value is a big long string. The JSON object is enclosed in apostrophes. I have tried removing the apostrophes, but instead of the extension_id value I still get a single curly bracket. The JSON object looks something like this:
Frame {
...
...
_name: '{"extension_id":"a big huge string that I need"} "a bunch of other stuff":"this is a valid json object as confirmed by jsonlint", "globalOptions":{"crev":"1.2.50"}}}'
}
It's a whole big ugly paragraph, but I really just need the extension_id. This is the code I'm currently using after attempt 100 or whatever:
var frames = await page.frames();
// I'm using puppeteer for this part but I don't think that's relevant overall.
var thing = frames[1]._name;
console.log(frames[1])
// console.log(thing)
thing.replace(/'/g, '"')
// this is to remove the apostrophes from the outside of the object. I thought that would change things before. it does not. still outputs a single {
JSON.parse(thing)
console.log(thing[0])
Instead of getting "a big huge string that I need" (or whatever is written in extension_id), I get a {. That's it. I think that is because the whole object starts with a curly bracket; this is confirmed by the fact that console.log(thing[2]) prints e. So what's going on? JSONLint says this is a valid JSON object, but maybe it's just a big string and I should be doing some kind of split to grab what's between the first : and the first ,. I'm really not sure.
For two reasons:
object[0] doesn't return the value of an object's "first property"; it returns the value of the property named "0", if any (there probably isn't one in your object); and
Because it's JSON, and when you're dealing with JSON in JavaScript code, you are by definition dealing with a string. Indexing a string with [0] gives its first character, which is why you see {. If you want to deal with the object that the JSON describes, parse it and use the value that JSON.parse returns.
Here's an example of parsing it and getting the value of the extension_id property from it:
const parsed = JSON.parse(frames[1]._name);
console.log(parsed.extension_id); // The ID
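For context, a minimal sketch of the difference, using a made-up string in place of frames[1]._name (the real value is much longer). Neither replace nor JSON.parse changes the string it is given, so their results have to be assigned to something.

```javascript
// Made-up stand-in for frames[1]._name.
const name = '{"extension_id":"abc123","globalOptions":{"crev":"1.2.50"}}';

console.log(name[0]); // first character of the string: {
console.log(name[2]); // third character of the string: e

JSON.parse(name);         // result discarded, so name is still a string
console.log(typeof name); // string

const parsed = JSON.parse(name);  // keep the object that JSON.parse returns
console.log(parsed.extension_id); // abc123
```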

nodejs skipping single quote from json key in output

I see a very weird problem when using JSON in Node.js: the single quotes are dropped from the 'revision' key. I want to pass this JSON as input to the Node request module, and since the single quotes are missing from the 'revision' key, it is not accepted as valid JSON input. Could someone help me retain them so that I can use it? I have made multiple attempts but have not been able to get it right.
What did I try ?
console.log(jsondata)
jsondata = {
'splits': {
'os-name': 'ubuntu',
'platform-version': 'os',
'traffic-percent': 100,
'revision': 'master'
}
}
Expected :-
{ splits:
{ 'os-name': 'ubuntu',
'platform-version': 'os',
'traffic-percent': 100,
'revision': 'master'
}
}
But in actual output single quote is missing from revision key :-
{ splits:
{ 'os-name': 'ubuntu',
'platform-version': 'os',
'traffic-percent': 100,
revision: 'master'
}
}
Run 2: Tried the code below; it produces the same thing.
data = JSON.stringify(jsondata)
result = JSON.parse(data)
console.log(result)
Run 3:- Used another way to achieve it
jsondata = {}
temp = {}
splits = []
temp['revision'] = 'master',
temp['os-name'] = 'ubuntu'
temp['platform-version'] = 'os'
temp['traffic-percent'] = 100
splits.push(temp)
jsondata['splits'] = splits
console.log(jsondata)
Run 4: Tried replacing single quotes with double quotes
Run 5: Changed the order of the revision line
This is what is supposed to happen. The quotes are kept only if the object key is not a valid JavaScript identifier. In your example, 'splits' and 'revision' don't have a dash in their names, so they are the only keys shown without quotes.
You shouldn't receive any error using this object - if you do, update this post mentioning the scenario and the error.
You should note that JSON and JavaScript are not the same things.
JSON is a text format in which all keys and all string values are surrounded by double quotes ("key" and "value"); numbers, booleans, and null are written without quotes. A JSON string is produced by JSON.stringify, and is required by JSON.parse.
A JavaScript object literal has very similar syntax to JSON, but is more flexible: string values can be surrounded by double quotes or single quotes, and the keys need no quotes at all as long as they are valid JavaScript identifiers. If the keys contain spaces, dashes, or other characters that are not valid in an identifier, then they need to be surrounded by single quotes or double quotes.
If you need your string to be valid JSON, generate it with JSON.stringify. If it's OK for it to be just valid JavaScript, then it's already fine - it does not matter whether the quotes are there or not.
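A short sketch of that point, using a trimmed-down version of the object from the question. However the literal is written, JSON.stringify emits every key in double quotes, so the missing quotes in the console.log output have no effect on the JSON that gets sent.

```javascript
// The same object written with and without quotes around the identifier-like keys.
const quoted = { 'splits': { 'os-name': 'ubuntu', 'traffic-percent': 100, 'revision': 'master' } };
const unquoted = { splits: { 'os-name': 'ubuntu', 'traffic-percent': 100, revision: 'master' } };

// Both literals produce the same JSON text, with every key in double quotes.
const json = JSON.stringify(quoted);
console.log(json);
console.log(json === JSON.stringify(unquoted)); // true

// Parsing the JSON text gives back an object with the same keys and values.
console.log(JSON.parse(json).splits.revision); // master
```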
If, for some reason, you need some imaginary third option (perhaps you are interacting with an API where someone has written their own custom string parser, and they are demanding that all keys are surrounded by single quotes?) you will probably need to write your own little string generator.

Parsing formatted strings in Go

The Problem
I have a slice of string values wherein each value is formatted based on a template. In my particular case, I am trying to parse Markdown URLs as shown below:
- [What did I just commit?](#what-did-i-just-commit)
- [I wrote the wrong thing in a commit message](#i-wrote-the-wrong-thing-in-a-commit-message)
- [I committed with the wrong name and email configured](#i-committed-with-the-wrong-name-and-email-configured)
- [I want to remove a file from the previous commit](#i-want-to-remove-a-file-from-the-previous-commit)
- [I want to delete or remove my last commit](#i-want-to-delete-or-remove-my-last-commit)
- [Delete/remove arbitrary commit](#deleteremove-arbitrary-commit)
- [I tried to push my amended commit to a remote, but I got an error message](#i-tried-to-push-my-amended-commit-to-a-remote-but-i-got-an-error-message)
- [I accidentally did a hard reset, and I want my changes back](#i-accidentally-did-a-hard-reset-and-i-want-my-changes-back)
What do I want to do?
I am looking for ways to parse this into a value of type:
type Entity struct {
Statement string
URL string
}
What have I tried?
As you can see, all the items follow the pattern: - [{{ .Statement }}]({{ .URL }}). I tried using the fmt.Sscanf function to scan each string as:
var statement, url string
fmt.Sscanf(s, "[%s](%s)", &statement, &url)
This results in:
statement = "I"
url = ""
The issue is with the scanner storing space-separated values only. I do not understand why the URL field is not getting populated based on this rule.
How can I get the Markdown values as mentioned above?
EDIT: As suggested by Marc, I will add a couple of clarification points:
This is a general-purpose question on parsing strings based on a format. In my particular case, a Markdown parser might help me, but my intention is to learn how to handle such cases in general, where a library might not exist.
I have read the official documentation before posting here.
Note: The following solution only works for "simple", non-escaped input markdown links. If this suits your needs, go ahead and use it. For full markdown-compatibility you should use a proper markdown parser such as gopkg.in/russross/blackfriday.v2.
You could use regexp to get the link text and the URL out of a markdown link.
So the general input text is in the form of:
[some text](somelink)
A regular expression that models this:
\[([^\]]+)\]\(([^)]+)\)
Where:
\[ is the literal [
([^\]]+) is for the "some text", it's everything except the closing square bracket
\] is the literal ]
\( is the literal (
([^)]+) is for the "somelink", it's everything except the closing parenthesis
\) is the literal )
Example:
r := regexp.MustCompile(`\[([^\]]+)\]\(([^)]+)\)`)
inputs := []string{
	"[Some text](#some/link)",
	"[What did I just commit?](#what-did-i-just-commit)",
	"invalid",
}
for _, input := range inputs {
	fmt.Println("Parsing:", input)
	allSubmatches := r.FindAllStringSubmatch(input, -1)
	if len(allSubmatches) == 0 {
		fmt.Println(" No match!")
	} else {
		parts := allSubmatches[0]
		fmt.Println(" Text:", parts[1])
		fmt.Println(" URL: ", parts[2])
	}
}
Output (try it on the Go Playground):
Parsing: [Some text](#some/link)
Text: Some text
URL: #some/link
Parsing: [What did I just commit?](#what-did-i-just-commit)
Text: What did I just commit?
URL: #what-did-i-just-commit
Parsing: invalid
No match!
You could create a simple lexer in pure Go code for this use case. There's a great talk by Rob Pike from years ago that goes into the design of text/template, which would be applicable. The implementation chains together a series of state functions into an overall state machine, and delivers the tokens out through a channel (via a goroutine) for later processing.

How can I test if a CK Editor field is empty

I want to test if a CKEditor ( Rich Text ) field is empty as part of some business logic.
I do not want to use the built in validation features.
If a CK Editor field has previously had text and then this text is deleted there is still content e.g.
<p dir="ltr">
</p>
I can get a handle to this text string using :
dataVar = xspdoc.getDocument().getMIMEEntity(dataNamevar).getContentAsText();
Is there a way to test if the CKEditor field is empty of visible text ?
Technically speaking, if it has what amounts to a single visible newline in it, as you've shown in your question, it isn't really "empty".
Realistically, you'll have to parse the content value to find out whether there is content that is not either inside tags or one of the few special characters like &nbsp; and so on.
I tend to do this in JS, if I have to, by taking the whole string of text and splitting it into an array on "<", then taking each element of the array, removing any text to the left of a ">", and trimming. That leaves me an array of either empty strings or text that is outside any tags. From there it's easy enough to check whether any of the strings in the array is not empty and not "&nbsp;".
This may be more cumbersome than some built-in parser that I don't know of, but it's fairly reliable and quick (and a very similar method can be used in formula language as well).
In ssjs formula you could:
var checkString = @Trim(@ReplaceSubstring(@Implode(@Trim(@Right(@Explode(sourceHTMLstring, "<"), ">")), " "), "&nbsp;", ""));
if (checkString == "") {
// *** You have no content
} else {
// *** you have content
}
Obviously this could be done just as easily in pure javascript, but the old formula language is so ingrained in my head, I'd go this way just out of habit.
** Also note: You may want to check for an <img> tag in there somewhere in case someone has done absolutely nothing other than put an image in the rich text.
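The split-and-trim approach described above could look roughly like this in plain JavaScript. The function name hasVisibleContent is made up for this sketch. It treats &nbsp; as whitespace and counts an <img> tag as content; like the formula version, it is not a real HTML parser and will be fooled by comments, scripts, other entities, or a ">" inside an attribute value.

```javascript
// Hypothetical helper: returns true if the HTML holds visible text or an image.
function hasVisibleContent(html) {
  // An image counts as content even when there is no text at all.
  if (/<img\b/i.test(html)) {
    return true;
  }
  return html
    .split("<") // every piece after the first starts inside a tag
    .map(function (piece, index) {
      if (index === 0) return piece; // text before the first "<" is outside any tag
      var close = piece.indexOf(">");
      return close === -1 ? "" : piece.slice(close + 1); // drop the tag, keep what follows
    })
    .map(function (text) {
      // Treat the &nbsp; entity and the U+00A0 character as ordinary spaces.
      return text.replace(/&nbsp;|\u00a0/g, " ").trim();
    })
    .some(function (text) {
      return text !== "";
    });
}

console.log(hasVisibleContent('<p dir="ltr">\n</p>'));      // false
console.log(hasVisibleContent('<p dir="ltr">&nbsp;</p>'));  // false
console.log(hasVisibleContent('<p dir="ltr">Hello</p>'));   // true
console.log(hasVisibleContent('<p><img src="a.png"></p>')); // true
```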
CKEditor has its own API, I guess this is the right method to use:
http://docs.cksource.com/ckeditor_api/symbols/CKEDITOR.editor.html#getData
This might be helpful: http://xpagetips.blogspot.com/2011/10/be-careful-with-empty-ckeditor-rich.html
Check if CKEditor is empty
For any browser
var editor=CKEDITOR.instances.editorName.getData();
I found best answer for this
function validateCKEDITORforBlank(ckData)
{
    ckData = ckData.replace(/<[^>]*>|\s/g, '');
    var vArray = new Array();
    vArray = ckData.split(" ");
    var vFlag = 0;
    for(var i=0;i<vArray.length;i++)
    {
        if(vArray[i] == '' || vArray[i] == "")
        {
            continue;
        }
        else
        {
            vFlag = 1;
            break;
        }
    }
    if(vFlag == 0)
    {
        return true;
    }
    else
    {
        return false;
    }
}

Detect a change in a rich text field's value in SPItemEventReceiver?

I currently have an Event Receiver that is attached to a custom list. My current requirement is to implement column level security for a Rich Text field (Multiple lines of text with enhanced rich text).
According to this post[webarchive], I can get the field's before and after values like so:
object oBefore = properties.ListItem[f.InternalName];
object oAfter = properties.AfterProperties[f.InternalName];
The problem is that I'm running into issues comparing these two values, which leads to false positives (the code detects a change when there wasn't one).
Exhibit A: Using ToString on both objects
oBefore.ToString()
<div class=ExternalClass271E860C95FF42C6902BE21043F01572>
<p class=MsoNormal style="margin:0in 0in 0pt">Text.
</div>
oAfter.ToString()
<DIV class=ExternalClass271E860C95FF42C6902BE21043F01572>
<P class=MsoNormal style="MARGIN: 0in 0in 0pt">Text.
</DIV>
Problems?
HTML tags are capitalized
Random spaces (see the additional space after margin:)
Using GetFieldValueForEdit or GetFieldValueAsHTML seems to result in the same values.
"OK," you say, so let's just compare the plain text values.
Exhibit B: Using GetFieldValueAsText
Fortunately, this method strips all of the HTML tags out of the value and only plain text is displayed. However, using this method led me to discover additional issues with whitespace characters:
In the before value:
Sometimes there are additional newline characters.
Sometimes spaces are displayed as non-breaking spaces (character code 160, U+00A0)
Question:
How can I detect if the user changed a rich text field in an event receiver?
[Ideal] Detect any change to HTML or text or white space
[Acceptable] Detect changes to text or white space
[Not so good] Detect changes to text characters only (strip all non-alphanumeric characters)
What happens if you set the ListItem field with the new value and read it back out? Does that give the same formatting?
object oBefore = properties.ListItem[f.InternalName];
properties.ListItem[f.InternalName] = properties.AfterProperties[f.InternalName];
object oAfter = properties.ListItem[f.InternalName];
// Don't update the item; restore the original value.
properties.ListItem[f.InternalName] = oBefore;
I would probably try something between choices 2 and 3:
bool changed =
valueAsTextBefore != valueAsTextAfter ||
0 != string.Compare(
oBefore.ToString().Replace(" ", ""),
oAfter.ToString().Replace(" ", ""),
true);
The left half checks if the text (including case) has changed while the right half checks if the tags or attributes have changed. Very kludgy, but should fit your case.
The only other thing I can think of is to run an XML transform on the HTML in order to standardize on case and spacing. Not only does that seem like overkill, it also assumes the HTML will always be well-formed.
I'm currently testing a combination approach: GetFieldValueAsText and then stripping out all characters except alphanumeric/punctuation:
static string GetRichTextValue(string value)
{
    if (null == value)
    {
        return string.Empty;
    }
    StringBuilder sb = new StringBuilder(value.Length);
    foreach (char c in value)
    {
        if (char.IsLetterOrDigit(c) || char.IsPunctuation(c))
        {
            sb.Append(c);
        }
    }
    return sb.ToString();
}
This only detects changes to the text of a rich text field but seems to work consistently.

Resources