Determine input encoding

Checking whether input is UTF-8 is generally a matter of heuristics: no algorithm can give a definitive yes/no answer, because the same byte sequence can be valid in more than one encoding. The more elaborate the heuristic, the fewer false positives and false negatives you will get, but there is no sure way.
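As a rough illustration of the simplest kind of heuristic, the sketch below looks for a byte order mark (BOM) at the start of the data. The function name guess_encoding_from_bom and the returned labels are made up for this example. A missing BOM proves nothing, and a match can still be wrong: UTF-16LE text that begins with U+0000 is indistinguishable from a UTF-32LE BOM.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: guesses an encoding from a leading byte order mark.
// Returns an empty string when no BOM is found, which does NOT mean the
// data is not Unicode; most UTF-8 files carry no BOM at all.
std::string guess_encoding_from_bom(const std::string& data)
{
    // True if `data` begins with the `len` bytes pointed to by `bom`.
    auto starts_with = [&data](const char* bom, std::size_t len) {
        return data.size() >= len && data.compare(0, len, bom, len) == 0;
    };

    // UTF-32LE must be tested before UTF-16LE: both start with FF FE.
    if (starts_with("\xFF\xFE\x00\x00", 4)) return "UTF-32LE";
    if (starts_with("\x00\x00\xFE\xFF", 4)) return "UTF-32BE";
    if (starts_with("\xEF\xBB\xBF", 3))     return "UTF-8";
    if (starts_with("\xFF\xFE", 2))         return "UTF-16LE";
    if (starts_with("\xFE\xFF", 2))         return "UTF-16BE";
    return "";
}
```

A real detector would treat this only as a first hint and fall back to validating the bytes against each candidate encoding.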

For an example of such heuristics, see the UTF8-CPP library: http://utfcpp.sourceforge.net/

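#include <fstream>
#include <iterator>
#include "utf8.h" // UTF8-CPP header

bool valid_utf8_file(const char* file_name)
{
    // Binary mode so the bytes reach the validator unmodified.
    std::ifstream ifs(file_name, std::ios::binary);
    if (!ifs)
        return false; // even better, throw here

    // Iterate over the raw bytes of the file without buffering them all.
    std::istreambuf_iterator<char> it(ifs.rdbuf());
    std::istreambuf_iterator<char> eos;

    return utf8::is_valid(it, eos);
}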

You can either use the library directly or read its source to see how the check is done.
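If you would rather not pull in a dependency, the core of a validity check is short. The following is a minimal sketch written for this answer, not UTF8-CPP's actual implementation; the name is_well_formed_utf8 is hypothetical. It applies the well-formedness rules from RFC 3629: correct lead and continuation bytes, no overlong encodings, no UTF-16 surrogates, and nothing above U+10FFFF.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of UTF8-CPP): returns true if `s` is
// well-formed UTF-8 as defined by RFC 3629.
bool is_well_formed_utf8(const std::string& s)
{
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        std::size_t len;       // total bytes in this sequence
        unsigned long cp;      // code point decoded so far
        unsigned long min_cp;  // smallest code point legal for this length

        if (b0 < 0x80)                { len = 1; cp = b0;        min_cp = 0x0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min_cp = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min_cp = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min_cp = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte

        if (len > n - i) return false;  // sequence truncated at end of input

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }

        if (cp < min_cp) return false;                   // overlong encoding
        if (cp > 0x10FFFF) return false;                 // outside Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate

        i += len;
    }
    return true;
}
```

Keep in mind what a positive result means: the bytes are valid UTF-8, not that the author intended UTF-8. Pure ASCII text, for instance, passes this check and is equally valid Latin-1.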