{"id":5400,"date":"2014-03-30T21:27:59","date_gmt":"2014-03-30T21:27:59","guid":{"rendered":"https:\/\/unknownerror.org\/index.php\/2014\/03\/30\/determine-input-encoding-collection-of-common-programming-errors\/"},"modified":"2014-03-30T21:27:59","modified_gmt":"2014-03-30T21:27:59","slug":"determine-input-encoding-collection-of-common-programming-errors","status":"publish","type":"post","link":"https:\/\/unknownerror.org\/index.php\/2014\/03\/30\/determine-input-encoding-collection-of-common-programming-errors\/","title":{"rendered":"Determine input encoding-Collection of common programming errors"},"content":{"rendered":"<p>In general, checking whether input is UTF-8 is a matter of heuristics: no algorithm can give a definitive &#8220;yes&#8221; or &#8220;no&#8221;. A validity check can confirm that the bytes are well-formed UTF-8, but not that they were intended as UTF-8. The more sophisticated the heuristic, the fewer false positives and negatives you will get, but there is no &#8220;sure&#8221; way.<\/p>\n<p>For an example of such a check, see the UTF8-CPP library: http:\/\/utfcpp.sourceforge.net\/<\/p>\n<pre><code>#include &lt;fstream&gt;\n#include &lt;iterator&gt;\n#include &quot;utf8.h&quot; \/\/ UTF8-CPP header\n\n\/\/ Returns true if the whole file is a valid UTF-8 byte sequence.\nbool valid_utf8_file(const char* file_name)\n{\n    std::ifstream ifs(file_name);\n    if (!ifs)\n        return false; \/\/ even better, throw here\n\n    \/\/ Iterate over the raw bytes of the file, from start to end of stream.\n    std::istreambuf_iterator&lt;char&gt; it(ifs.rdbuf());\n    std::istreambuf_iterator&lt;char&gt; eos;\n\n    return utf8::is_valid(it, eos);\n}\n<\/code><\/pre>\n<p>You can either use the library directly or read its source to see how the check is implemented.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In general, checking whether input is UTF-8 is a matter of heuristics: no algorithm can give a definitive &#8220;yes&#8221; or &#8220;no&#8221;. The more sophisticated the heuristic, the fewer false positives and negatives you will get, but there is no &#8220;sure&#8221; way. 
For an example of such a check, see the UTF8-CPP library: http:\/\/utfcpp.sourceforge.net\/ bool valid_utf8_file(const char* file_name) { [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-5400","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/5400","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/comments?post=5400"}],"version-history":[{"count":0,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/5400\/revisions"}],"wp:attachment":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/media?parent=5400"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/categories?post=5400"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/tags?post=5400"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}