Load HTML page with German umlauts into WebView?

SimonRckert · DE · Member ✭✭
edited June 2013 in Xamarin.Android

Hello,

I load a web page into a string and remove some elements (divs in the content).
Then I try to show the HTML page in a WebView. That works, but all German umlauts are shown as "?".

I tried several approaches for a few hours, but I can't get it working.

Fiddler shows these request headers for the loaded URL:

    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Encoding: gzip,deflate,sdch
    Accept-Language: de-DE,de;q=0.8,en-US;q=0.6,en;q=0.4
    User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36

The head of the HTML contains:

    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
    <meta name="robots" content="yes, all, follow,index" />
    <meta name="revisit-after" content="1 days" />
    <meta http-equiv="Content-Language" content="de" />

I'm loading it in the simplest way, like this:

```
protected override void OnCreate(Bundle savedInstanceState)
{
    base.OnCreate(savedInstanceState);

    SetContentView(Resource.Layout.Books);
    var myWebView = FindViewById<WebView>(Resource.Id.webview);
    string url = "http://m.cetpm.de/shop.html?kat=1";

    var url2 = new Uri(url);
    HttpWebRequest http = (HttpWebRequest)WebRequest.Create(url2);
    //headers
    http.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36";
    http.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
    http.Headers.Add("Accept-Encoding", "gzip,deflate,sdch");
    http.Headers.Add("Accept-Language", "de-DE,de;q=0.8,en-US;q=0.6,en;q=0.4");
    http.Headers.Add("Accept-Charset", "ISO-8859-1,utf-8;q=0.7,*;q=0.3");

    using (WebResponse response = http.GetResponse()) {
        var reader = new StreamReader(response.GetResponseStream());
        var result = reader.ReadToEnd();

        var wc = new WebViewClient();
        myWebView.SetWebViewClient(wc);
        myWebView.LoadDataWithBaseURL(null, result, "text/html", "utf-8", null); //"utf-8" iso-8859-1
    }
}
```

I hope you can help me!

Best regards,

Simon

Posts

  • FZelle · DE · Member ✭✭✭

    It is the same problem every time you use the StreamReader.

    As a German you should know that you have to use

     new StreamReader(response.GetResponseStream(),System.Text.Encoding.Default);
    

    regardless of whether you are reading text files or anything else.

  • gregko · US · Member
    edited July 2013

    From the HTML header you can see that your page uses the "iso-8859-1" encoding, but your StreamReader constructor call does not specify an encoding, so it defaults to UTF-8. See if passing the correct encoding to StreamReader fixes your error. If not, you could read the page as a byte array first and then convert it to a string with the correct encoding parameter. In Java, at least, you would write:

    String s = new String(byteArray, "iso-8859-1");

    You may also need to change the header in your string from content="text/html; charset=iso-8859-1" to content="text/html; charset=UTF-8", if UTF-8 is what you are loading into the web view.

    Greg
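
    A minimal C# sketch of the two suggestions above, under stated assumptions: `CharsetFix`, `Decode`, and `DeclareUtf8` are hypothetical names rather than part of any library, and the regular expression rewrites only the first `charset=...` it finds, which is assumed to be the one in the `<meta>` tag.

    ```csharp
    using System;
    using System.Text;
    using System.Text.RegularExpressions;

    // Hypothetical helper, for illustration only.
    static class CharsetFix
    {
        // Decode raw response bytes with the charset the page declares,
        // the C# counterpart of Java's new String(byteArray, "iso-8859-1").
        public static string Decode(byte[] raw, string charsetName)
        {
            return Encoding.GetEncoding(charsetName).GetString(raw);
        }

        // Rewrite the first charset declaration to utf-8, for the case where
        // the string is later handed to the WebView as UTF-8.
        public static string DeclareUtf8(string html)
        {
            var charset = new Regex(@"charset\s*=\s*[\w-]+", RegexOptions.IgnoreCase);
            return charset.Replace(html, "charset=utf-8", 1);
        }
    }
    ```

    For example, `CharsetFix.Decode(new byte[] { 0xFC }, "iso-8859-1")` yields "ü", whereas decoding the same byte as UTF-8 yields the replacement character U+FFFD, which may be what shows up as "?" in the WebView.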

  • SimonRckert · DE · Member ✭✭

    Thank you. I tried both proposals, but neither works.

    I can't make any changes to the website.

    Did the page display correctly on your side?

  • SimonRckert · DE · Member ✭✭
    edited August 2013

    I don't know why, but after 100 different combinations, this is now working for me:

    ```
    public static void InitializeWebView(string siteUrl, CustomActivity activity)
    {
        try
        {
            var webClient = new WebClient();
            webClient.Headers.Add("Accept-Encoding", "gzip,deflate,sdch");
            webClient.Headers.Add("Content-Type", "text/html; charset=iso-8859-1");

            // Attach the handler before starting the download.
            webClient.DownloadStringCompleted += (x, y) =>
            {
                try
                {
                    var html = y.Result;
                    html = HtmlHelperMethods.RemoveHeaderAndFooter(html);

                    activity.RunOnUiThread(() =>
                    {
                        var myWebView = activity.FindViewById<WebView>(Resource.Id.webview);

                        var webViewClient = new CustomWebViewClient();
                        myWebView.SetWebViewClient(webViewClient);

                        myWebView.Settings.JavaScriptEnabled = true;
                        // The WebView loads siteUrl itself here; the downloaded
                        // html string above is not passed to it.
                        myWebView.LoadUrl(siteUrl);
                    });
                }
                catch (Exception ex)
                {
                    // Errors are silently ignored.
                }
            };

            webClient.DownloadStringAsync(new Uri(siteUrl));
        }
        catch (Exception ex)
        {
            // Errors are silently ignored.
        }
    }
    ```

  • BrendanZagaeski · US · Forum Administrator, Xamarin Team, Xamurai

    Getting character sets to cooperate is always tricky. One option in this case (as hinted at by the other posters) is first to convert the stream to a "native" C# string when reading it in:

    new StreamReader (responseStream.GetResponseStream (), Encoding.GetEncoding ("iso-8859-1"));
    

    ... and then to load the string in the original encoding, with a data scheme URI passed to LoadUrl():

    webView.LoadUrl ("data:text/html;charset=iso-8859-1;base64," + Convert.ToBase64String (Encoding.GetEncoding ("iso-8859-1").GetBytes (responseContent)));
    

    (Simple example project attached)

    The argument passed to GetEncoding() must match the charset from the <meta http-equiv="Content-Type" content="text/html;charset=*" /> tag within the data stream. You could change this tag while making your other changes to responseContent if you wanted to. You could alternatively use Uri.EscapeDataString(), which converts all Unicode characters to UTF-8 before escaping them:

    webView.LoadUrl ("data:text/html;charset=utf-8," + Uri.EscapeDataString (responseContent));
    

    Note that Uri.EscapeDataString() will escape no more than 32766 characters at a time, so if the response is longer than that, you would need to divide it into several smaller strings.
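
    One way to divide a long response could look like the sketch below. `DataUriEscaper` and `EscapeLongString` are hypothetical names, and the chunk size is simply the limit quoted above. The sketch also avoids cutting a surrogate pair in half, since each chunk is escaped independently.

    ```csharp
    using System;
    using System.Text;

    // Hypothetical helper, for illustration only.
    static class DataUriEscaper
    {
        // Limit quoted in the post above; treated here as an assumption.
        const int MaxChunk = 32766;

        public static string EscapeLongString(string text)
        {
            var escaped = new StringBuilder();
            int pos = 0;
            while (pos < text.Length)
            {
                int len = Math.Min(MaxChunk, text.Length - pos);

                // Do not end a chunk on the first half of a surrogate pair.
                if (pos + len < text.Length && char.IsHighSurrogate(text[pos + len - 1]))
                    len--;

                escaped.Append(Uri.EscapeDataString(text.Substring(pos, len)));
                pos += len;
            }
            return escaped.ToString();
        }
    }
    ```

    It could then be used as `webView.LoadUrl ("data:text/html;charset=utf-8," + DataUriEscaper.EscapeLongString (responseContent));`.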

    The trouble with using LoadData() is that it assumes the URL uses "the default US-ASCII charset". The documentation recommends that if you want to load non-ASCII characters, you should create a custom data scheme URI by hand, and pass it to LoadUrl(). Note that currently the Android SDK will let you pass, for example, text/html;charset=iso-8859-1 as the second parameter to LoadData(), but again the docs state that you should use LoadUrl(). So using LoadData() this way is not recommended.
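
    Building the data scheme URI by hand could be wrapped in a small helper, sketched here under assumptions: `DataUriBuilder` and `Build` are hypothetical names, and the charset passed in is assumed to match the one declared in the page's `<meta>` tag.

    ```csharp
    using System;
    using System.Text;

    // Hypothetical helper, for illustration only.
    static class DataUriBuilder
    {
        // Encode the HTML in the given charset and wrap it in a
        // base64 data URI that declares the same charset.
        public static string Build(string html, string charsetName)
        {
            byte[] bytes = Encoding.GetEncoding(charsetName).GetBytes(html);
            return "data:text/html;charset=" + charsetName + ";base64,"
                + Convert.ToBase64String(bytes);
        }
    }
    ```

    The result would be passed to the WebView with `webView.LoadUrl (DataUriBuilder.Build (responseContent, "iso-8859-1"));`.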
