Ruby Markdown / String Processing – Issue with Encoding?


  • Brandon

    Good afternoon,

    I am trying to use a JavaScript-based editor (EpicEditor) to let admins on my site enter markdown for blog posts. On clicking the submit button, the string generated by the editor is sent to the Ruby/Rails server for processing into HTML via RDiscount.

    It mostly seems to be working, with the exception of something to do with &nbsp;, spaces and the like. Clicking the “preview” button in EpicEditor gives the exact output I expect, so I knew it was something to do with the way the string was being sent to the server. I’m still fairly new to Ruby, and not very good with string encodings etc. I’m sure this is a fairly simple question for the right person.

    To illustrate the issue – just showing the code is probably best. I’m trying to input a bulleted list underneath an ordered list. As such, the markdown would be something like:

    1. Hello
    2. Goodbye
      - a list
      - entry
    3. For something

    My issue seems to be in and around the newline after the ordered list entry. The string saved to the database and the string in the JavaScript editor appeared exactly the same when printed: "\n  - had\n"

    However on further inspection, there was a slight difference between the “working string” (what was being used in the preview view in EpicEditor) and the “failing string” (the exact string stored in my database that had been passed by the form):

    working_string.each_byte {|c| puts c}
    --> 10, 32, 32, 45, 32, 104, 97, 100, 10

    failing_string.each_byte {|c| puts c}
    --> 10, 194, 160, 32, 45, 32, 104, 97, 100, 10

    Somehow, a 32 byte was being replaced with 194 160. On further research, this appears to be the difference between a regular space in a string and the byte sequence \xC2\xA0, which is the UTF-8 encoding of a non-breaking space, and has something to do (I think) with &nbsp;’s.
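
    To make the byte difference concrete, here is a small Ruby sketch that rebuilds the two strings from the byte lists quoted above. The variable names `working` and `failing` are only illustrative, and the sketch assumes the strings are UTF-8, which is what the 194 160 pair suggests.

```ruby
# The non-breaking space is code point U+00A0; in UTF-8 it takes two bytes.
nbsp  = "\u00A0"
space = " "

p nbsp.bytes       # => [194, 160]
p nbsp.codepoints  # => [160]
p space.bytes      # => [32]

# Rebuild both strings from the byte lists in the question.
working = [10, 32, 32, 45, 32, 104, 97, 100, 10].pack("C*").force_encoding("UTF-8")
failing = [10, 194, 160, 32, 45, 32, 104, 97, 100, 10].pack("C*").force_encoding("UTF-8")

p working == failing                    # => false
p failing.tr("\u00A0", " ") == working  # => true
```

    Both strings look identical when printed, which is why only a byte-level dump shows the difference.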

    Is there an easy way to make sure the string gets passed correctly from EpicEditor to the Rails server and into the database? I can also work with string substitutions on the server side if that will help….

    Thanks!

    UPDATE 1

    I went through and tried to find the exact issue. I grabbed the text from EpicEditor, from the textarea I was mirroring from EpicEditor (to submit with the form), the params that went over the wire, and the string in the database. The markdown was as follows (copied directly from the EpicEditor text area):

    ## Testing this yet again
    
    Because I want more!! More things to have a good time....
    
    1.  Everywhere!
    2.  Anywhere!
      - happy place
      - sad place
    3.  Goodbye
    

    If I paste this directly into this input, you can see it renders properly:

    Testing this yet again

    Because I want more!! More things to have a good time….

    1. Everywhere!
    2. Anywhere!
    3. Goodbye

    The character codes of this string are as follows:

    [35, 35, 32, 84, 101, 115, 116, 105, 110, 103, 32, 116, 104, 105, 115, 32, 121, 101, 116, 32, 97, 103, 97, 105, 110, 10, 10, 66, 101, 99, 97, 117, 115, 101, 32, 73, 32, 119, 97, 110, 116, 32, 109, 111, 114, 101, 33, 33, 32, 77, 111, 114, 101, 32, 116, 104, 105, 110, 103, 115, 32, 116, 111, 32, 104, 97, 118, 101, 32, 97, 32, 103, 111, 111, 100, 32, 116, 105, 109, 101, 46, 46, 46, 46, 10, 10, 49, 46, 32, 32, 69, 118, 101, 114, 121, 119, 104, 101, 114, 101, 33, 10, 50, 46, 32, 32, 65, 110, 121, 119, 104, 101, 114, 101, 33, 10, 32, 32, 45, 32, 104, 97, 112, 112, 121, 32, 112, 108, 97, 99, 101, 10, 32, 32, 45, 32, 115, 97, 100, 32, 112, 108, 97, 99, 101, 10, 51, 46, 32, 32, 71, 111, 111, 100, 98, 121, 101]
    

    I am copying the editor contents to a textarea via exportFile(). The character codes in that textarea are as follows:

    [35,35,32,84,101,115,116,105,110,103,32,116,104,105,115,32,121,101,116,32,97,103,97,105,110,10,10,66,101,99,97,117,115,101,32,73,32,119,97,110,116,32,109,111,114,101,33,33,32,77,111,114,101,32,116,104,105,110,103,115,32,116,111,32,104,97,118,101,32,97,32,103,111,111,100,32,116,105,109,101,46,46,46,46,10,10,49,46,32,160,69,118,101,114,121,119,104,101,114,101,33,10,50,46,32,160,65,110,121,119,104,101,114,101,33,10,160,32,45,32,104,97,112,112,121,32,112,108,97,99,101,10,160,32,45,32,115,97,100,32,112,108,97,99,101,10,51,46,32,160,71,111,111,100,98,121,101]
    

    If I diff these arrays, I get the following:

    textarea - epiceditor
     => [160, 160, 160, 160, 160] 
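
    The same comparison can be sketched in Ruby on shortened stand-ins for the two strings. The sample strings and variable names below are assumptions for illustration, not the full post body. Note that these are code points (160), whereas `each_byte` on a UTF-8 string would report the two bytes 194, 160 instead.

```ruby
# Shortened stand-ins for the editor text and the mirrored textarea text.
from_editor   = "1.  Everywhere!\n  - happy place"
from_textarea = "1. \u00A0Everywhere!\n\u00A0 - happy place"

# Array difference, as in the post: code points present only in the textarea.
extra = from_textarea.codepoints - from_editor.codepoints
p extra  # => [160, 160]

# Character index of every non-breaking space, to see where they land.
positions = from_textarea.each_char.with_index
                         .select { |ch, _| ch == "\u00A0" }
                         .map(&:last)
p positions  # => [3, 16]
```

    In this sample the non-breaking spaces sit next to regular spaces, matching the pattern in the arrays above (`32, 160` after a list number and `160, 32` before a nested bullet).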
    

    In other words, there appear to be extra &nbsp;’s still being output from EpicEditor in the exportFile(). Any thoughts on where this might be coming from? I complete the mirroring as follows:

    var post_body = $("#post_body");
    var content = post_body.val();
    
    var editor = new EpicEditor(opts);
    
    editor.on('load', function () {
      editor.importFile(window.location.href, content); // Load the existing post body into the editor
    }); 
    
    editor.on('save', function () {
      post_body.val(editor.exportFile());
    });
    

    UPDATE 2

    P.S. The following fixes the exportFile() output. I’m happy that at the very least I’ve gotten it figured out, but I was hoping to get something that works natively with EpicEditor!

    bad = "#{194.chr}#{160.chr}".force_encoding('utf-8')
    good = 32.chr
    self.body = body.gsub(bad, good)
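
    A slightly more compact variant of the same server-side substitution is sketched below. The module name `MarkdownWhitespace` and its `normalize` method are hypothetical, not part of EpicEditor, RDiscount, or Rails, and the sketch assumes the incoming string is UTF-8.

```ruby
# Hypothetical helper; the name and placement are assumptions for illustration.
module MarkdownWhitespace
  NBSP = "\u00A0" # non-breaking space, bytes 194 160 in UTF-8

  # Returns a copy of text with every non-breaking space
  # replaced by a regular space (byte 32).
  def self.normalize(text)
    text.to_s.tr(NBSP, " ")
  end
end

body  = "2. \u00A0Anywhere!\n\u00A0 - happy place\n"
clean = MarkdownWhitespace.normalize(body)

p clean                      # => "2.  Anywhere!\n  - happy place\n"
p clean.bytes.include?(194)  # => false
```

    In a Rails model, something like this could be called from a `before_validation` callback so the body is cleaned before it reaches RDiscount; that wiring is left out here to keep the sketch plain Ruby.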