{"id":5915,"date":"2014-04-10T07:47:07","date_gmt":"2014-04-10T07:47:07","guid":{"rendered":"https:\/\/unknownerror.org\/index.php\/2014\/04\/10\/mysql-find-cyrillic-or-greek-text-in-database-collection-of-common-programming-errors-2\/"},"modified":"2014-04-10T07:47:07","modified_gmt":"2014-04-10T07:47:07","slug":"mysql-find-cyrillic-or-greek-text-in-database-collection-of-common-programming-errors-2","status":"publish","type":"post","link":"https:\/\/unknownerror.org\/index.php\/2014\/04\/10\/mysql-find-cyrillic-or-greek-text-in-database-collection-of-common-programming-errors-2\/","title":{"rendered":"mysql: find cyrillic or greek text in database-Collection of common programming errors"},"content":{"rendered":"<p>I have a UTF-8 table in MySQL containing names in all kinds of text (numerals, capitals, Greek, Cyrillic, etc.).<\/p>\n<pre><code>---------------\nID   Name\n---------------\n001  Jane Smith\n002  John Doe\n003  ????? ????\n004  ????? ????\n005  \"Groove\" Holme\n006  99er Dude\n<\/code><\/pre>\n<p>How can I select only the Cyrillic names (records 003 and 004)?<\/p>\n<p><strong>EDIT<\/strong><\/p>\n<p>Thanks for the answer below, which looked correct but didn&#8217;t work. Further research turned up this warning in the MySQL documentation:<\/p>\n<blockquote>\n<p>Warning<\/p>\n<p>The REGEXP and RLIKE operators work in byte-wise fashion, so they are not multi-byte safe and may produce unexpected results with multi-byte character sets. In addition, these operators compare characters by their byte values and accented characters may not compare as equal even if a given collation treats them as equal.<\/p>\n<\/blockquote>\n<p><strong>EDIT EDIT, A SOLUTION<\/strong><\/p>\n<p>I solved this by adding an extra column to my table that stores the script type, e.g. Cyrillic or Thai. Then I ran a batch process in PHP that detects the script of each name and stores it in that column.<\/p>\n<p>To detect the script in PHP, use the Unicode-aware regex functions, for example a PCRE script property such as <code>\\p{Cyrillic}<\/code> together with the <code>u<\/code> pattern modifier. 
See this page:<\/p>\n<p>http:\/\/www.regular-expressions.info\/unicode.html<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I have a UTF8 table in MySQL containing names, with all types of text (numeric, capitals, Greek, Cyrillic etc). &#8212;&#8212;&#8212;&#8212;&#8212; ID Name &#8212;&#8212;&#8212;&#8212;&#8212; 001 Jane Smith 002 John Doe 003 ????? ???? 004 ????? ???? 005 &#8220;Groove&#8221; Holme 006 99er Dude How can I select only the Cyrillic names? (records 003 and 004) EDIT Thanks [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-5915","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/5915","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/comments?post=5915"}],"version-history":[{"count":0,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/5915\/revisions"}],"wp:attachment":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/media?parent=5915"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/categories?post=5915"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/tags?post=5915"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}