Discussion Board

  1. #1
    Regular Contributor
    Join Date
    May 2009
    Posts
    95

    Specifying InputStream encoding...

    Hi friends,

    I've come across a misunderstanding that I might have about InputStreams. I'm using an InputStream and I'm reading data using: inputStream.read()

    Now if I have a character 'a' (which is 1 byte long) and an accented French e (which is 2 bytes long), the inputStream will still read it normally, and I'm not specifying any character encoding anywhere.

    Could anyone explain why this happens? And is there a way to specify the character encoding, to make sure that data is read successfully without any mistakes?

  2. #2
    Super Contributor
    Join Date
    Jun 2003
    Location
    Cheshire, UK
    Posts
    7,395

    Re: Specifying InputStream encoding...

    Quote Originally Posted by kurteknikk View Post
    Now if I have a character 'a' (which is 1 byte long) and an accented French e (which is 2 bytes long), the inputStream will still read it normally, and I'm not specifying any character encoding anywhere.
    This doesn't happen. InputStreams read bytes, not characters.

    How are you converting the bytes to characters?

    For example, if you do:

    Code:
    byte[] data = readBytesFromStream(myInputStream);
    String s = new String(data);
    The constructor for String converts the bytes to characters using the platform default encoding (that's the encoding returned by System.getProperty("microedition.encoding")). You should not rely on this, since it is different on different devices. Code that works fine on your phone can suddenly stop working on someone else's.

    Code:
    String s = new String(data, "UTF-8");
    Creating a String, you can specify the encoding. However, beware that there are few guarantees that a phone will support a particular encoding. MIDP-1 devices are not required to support any specific encoding. For MIDP-2 devices, "UTF-8" is your best bet. Of course, the encoding must match the encoding of the data you're reading!
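    As a rough illustration of the point above, here is a minimal sketch that decodes three bytes as UTF-8 and shows that 'a' takes one byte while an accented e takes two. The class and method names (EncodingDemo, utf8Length, decodeUtf8) are made up for this example, and it assumes the platform supports "UTF-8".

```java
import java.io.UnsupportedEncodingException;

// Hypothetical demo class, not part of any MIDP API.
class EncodingDemo {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String text) throws UnsupportedEncodingException {
        return text.getBytes("UTF-8").length;
    }

    // Decode bytes that are known to be UTF-8, naming the encoding explicitly
    // instead of relying on the platform default.
    static String decodeUtf8(byte[] data) throws UnsupportedEncodingException {
        return new String(data, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // 0x61 is 'a'; 0xC3 0xA9 is U+00E9 (e with acute accent) in UTF-8.
        byte[] data = { 0x61, (byte) 0xC3, (byte) 0xA9 };
        String s = decodeUtf8(data);
        System.out.println("bytes=" + data.length + " chars=" + s.length());
    }
}
```

    Three bytes go in and two characters come out, because the decoder knows that the last two bytes belong together.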

    Graham.

  3. #3
    Regular Contributor
    Join Date
    May 2009
    Posts
    95

    Re: Specifying InputStream encoding...

    Yes, you're right, heh. The thing is that I'm handling data in a StringBuffer and then creating a new String:

    String temporaryData = new String(sb);

    But i didn't find a way to set the encoding for the stringbuffer...

  4. #4
    Super Contributor
    Join Date
    Jun 2003
    Location
    Cheshire, UK
    Posts
    7,395

    Re: Specifying InputStream encoding...

    You know, I could help you so much better if you told me the complete story...

    How does the StringBuffer relate to the InputStream? StringBuffers don't need an encoding, because they don't have a method for appending bytes...

  5. #5
    Regular Contributor
    Join Date
    May 2009
    Posts
    95

    Re: Specifying InputStream encoding...

    This is my code...

    Code:
        public String getRxdData() throws IOException {
            // retrieve the contents of the message from Web server
            StringBuffer sb = new StringBuffer();
            int c = 0;
            while (inputStream.available() > 0) {
                c = inputStream.read();
                sb.append((char)c);
            }
            String temporaryData = new String(sb); //TODO stringBuffer default encoding is UTF-8
            return temporaryData;
        }
    What I told you in the beginning was that since a = 1 byte and (French) e = 2 bytes, inputStream.read() still returns 1 byte, and I'm casting that byte into a character. So how come I'm reading the 1st byte of the (French) e, converting it into a character, and then reading the 2nd byte and converting it into "another" character :s

    That's why i got mixed up mate...

  6. #6
    Super Contributor
    Join Date
    Jun 2003
    Location
    Cheshire, UK
    Posts
    7,395

    Re: Specifying InputStream encoding...

    You read one byte, you convert one byte into a character. You will get one character for every byte in the original stream.

    Do you know what the encoding is of the information you're reading? Where is it coming from?

    It's a very bad plan to cast bytes to chars like this.
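    To see what the cast actually produces, here is a small sketch that mimics the byte-by-byte loop on the two UTF-8 bytes of an accented e. CastDemo and castEachByte are names invented for this example.

```java
// Hypothetical demo class showing why casting bytes to chars mangles UTF-8.
class CastDemo {
    // Mimics the loop in getRxdData(): every byte becomes its own char.
    static String castEachByte(byte[] data) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < data.length; i++) {
            // InputStream.read() returns the byte as an int in the range 0..255.
            int c = data[i] & 0xFF;
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        byte[] eAcute = { (byte) 0xC3, (byte) 0xA9 }; // U+00E9 encoded as UTF-8
        System.out.println(castEachByte(eAcute).length());        // two chars: U+00C3, U+00A9
        System.out.println(new String(eAcute, "UTF-8").length()); // one char: U+00E9
    }
}
```

    The cast yields the two characters U+00C3 and U+00A9 (which display as "Ã©"), while decoding the same bytes as UTF-8 yields the single character U+00E9.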

    You could consider using an InputStreamReader, which is designed for reading streams of characters. However, this will cause problems if you also have non-character data in the same stream. An alternative is to use a ByteArrayOutputStream (rather than a StringBuffer) to accumulate the bytes, then convert the entire byte[] to a String.
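    A minimal sketch of the InputStreamReader approach might look like the following. ReaderDemo and readAll are hypothetical names, the 256-char chunk size is arbitrary, and the sketch assumes the stream contains only UTF-8 text and eventually ends.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical demo class: decode a whole stream as UTF-8 text.
class ReaderDemo {
    static String readAll(InputStream in) throws IOException {
        // The reader does the byte-to-char decoding, so a multi-byte
        // character is never split into separate chars.
        Reader reader = new InputStreamReader(in, "UTF-8");
        StringBuffer sb = new StringBuffer();
        char[] chunk = new char[256];
        int n;
        // read(char[]) returns -1 at the end of the stream.
        while ((n = reader.read(chunk)) != -1) {
            sb.append(chunk, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = { 0x61, (byte) 0xC3, (byte) 0xA9 }; // "a" plus U+00E9 in UTF-8
        System.out.println(readAll(new ByteArrayInputStream(data)).length());
    }
}
```

    Note that this loop only finishes when the stream ends, so on a connection that stays open it will block waiting for more data.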

    Oh... don't use InputStream.available() like this. It's not for detecting the end of a stream. The default implementation all InputStream subclasses inherit always returns zero. This code will break on some phones. InputStream.read() returns -1 at the end of a stream.

    Better still... don't read one byte at a time. It can be very, very slow. Read into a byte[].

    Code:
    public String getRxdData() throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        byte[] buffer = new byte[BUFFER_SIZE];
        int bufferUsed;
        while ( (bufferUsed = inputStream.read(buffer)) > 0 ) {
            bout.write(buffer, 0, bufferUsed);
        }
        return new String(bout.toByteArray(), "UTF-8");
    }
    512 or 1024 are good values for BUFFER_SIZE.

  7. #7
    Regular Contributor
    Join Date
    May 2009
    Posts
    95

    Re: Specifying InputStream encoding...

    So my first question is: i can use inputStream.available() to check if data is available right ?

    Secondly:
    Do you know what the encoding is of the information you're reading? Where is it coming from?
    The encoding is UTF-8 and the data is coming from an application server so i know what kind of data i should be receiving...

    The default implementation all InputStream subclasses inherit always returns zero. This code will break on some phones.
    But the default implementation of the InputStream doesn't always return zero right ? because as i already told you above i'm using it to check if data available and i'm not having any problems.

    Finally... the problem with your code is that it reads fine the first time, but since it's a while loop it tries to read a second time to check whether the number of bytes read is > 0. Since no data will be available, the method will block until data is available... :-/

  8. #8
    Super Contributor
    Join Date
    Jun 2003
    Location
    Cheshire, UK
    Posts
    7,395

    Re: Specifying InputStream encoding...

    Quote Originally Posted by kurteknikk View Post
    So my first question is: i can use inputStream.available() to check if data is available right ?
    Depends what you mean by "available". Check the specification:

    Returns the number of bytes that can be read (or skipped over) from this input stream without blocking by the next caller of a method for this input stream.
    So, it can return zero even if more data exists.

    Quote Originally Posted by kurteknikk View Post
    But the default implementation of the InputStream doesn't always return zero right ? because as i already told you above i'm using it to check if data available and i'm not having any problems.
    Again, check the specification:

    The available method for class InputStream always returns 0.
    You're not using an instance of InputStream - you can't, since InputStream is abstract. You're using an instance of a subclass, and that subclass has overridden the available method. However, this is implementation dependent. I realize that your code is working fine on whatever you're running it on, but it will break on other implementations.
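    The difference can be sketched with a stream subclass that does not override available(). PlainStream, countWithAvailable and countUntilEof are invented for this illustration; a real device's socket stream may or may not behave like PlainStream.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class comparing the two loop conditions.
class AvailableDemo {
    // A stream that serves bytes from an array but inherits
    // InputStream.available(), which always returns 0.
    static class PlainStream extends InputStream {
        private final byte[] data;
        private int pos = 0;

        PlainStream(byte[] data) {
            this.data = data;
        }

        public int read() {
            return pos < data.length ? (data[pos++] & 0xFF) : -1;
        }
    }

    // The loop style from the original getRxdData().
    static int countWithAvailable(InputStream in) throws IOException {
        int count = 0;
        while (in.available() > 0) {
            in.read();
            count++;
        }
        return count;
    }

    // The loop style that relies on read() returning -1 at end of stream.
    static int countUntilEof(InputStream in) throws IOException {
        int count = 0;
        while (in.read() != -1) {
            count++;
        }
        return count;
    }
}
```

    With three bytes in a PlainStream, the available() loop reads nothing at all, while the read-until-minus-one loop sees all three bytes.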

    Quote Originally Posted by kurteknikk View Post
    Finally... the problem with your code is that it reads fine the first time, but since it's a while loop it tries to read a second time to check whether the number of bytes read is > 0. Since no data will be available, the method will block until data is available... :-/
    Did you try it?

    Graham.

  9. #9
    Regular Contributor
    Join Date
    May 2009
    Posts
    95

    Re: Specifying InputStream encoding...

    Did you try it?
    Yes, I tried it, and when I added a check with inputStream.available() it worked. But since you're saying that inputStream.available() doesn't work on all phones, I should find another way to check whether more data is available in the inputStream before trying to read. Otherwise the method is going to block (as it already did) and won't continue until it has data (and even when more data arrives, it will block again the next time it tries to check for data).

    Any ideas ?

  10. #10
    Super Contributor
    Join Date
    Jun 2003
    Location
    Cheshire, UK
    Posts
    7,395

    Re: Specifying InputStream encoding...

    Don't use available()!

    Change:

    Code:
    while ( (bufferUsed = inputStream.read(buffer)) > 0 ) {
    to:

    Code:
    while ( (bufferUsed = inputStream.read(buffer)) > -1 ) {
    What are you testing on?

  11. #11
    Regular Contributor
    Join Date
    May 2009
    Posts
    95

    Re: Specifying InputStream encoding...

    Still it didn't work...

    Quote Originally Posted by grahamhughes View Post
    What are you testing on?
    I'm testing on a Nokia 6120 classic. I think that it can never work because:

    This method blocks until input data is available, end of file is detected, or an exception is thrown.
    That's why I've got a problem checking whether data is available: before doing all this I was checking that data was available and only then reading. So I really need to check whether data is available or not, preferably without reading it...

  12. #12
    Super Contributor
    Join Date
    Jun 2003
    Location
    Cheshire, UK
    Posts
    7,395

    Re: Specifying InputStream encoding...

    From what kind of connection does the InputStream come?

  13. #13
    Regular Contributor
    Join Date
    May 2009
    Posts
    95

    Re: Specifying InputStream encoding...

    Quote Originally Posted by grahamhughes View Post
    From what kind of connection does the InputStream come?
    From a socket connection...

  14. #14
    Super Contributor
    Join Date
    Jun 2003
    Location
    Cheshire, UK
    Posts
    7,395

    Re: Specifying InputStream encoding...

    Right. The problem is that the stream doesn't end, as it would if it was an HttpConnection.

    I'd suggest sending a length before transmitting the text, so the client can then receive the correct number of bytes.
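    One way this could look on the client is sketched below. It assumes a framing that is not specified anywhere in this thread: the server writes a 4-byte big-endian length followed by exactly that many payload bytes. FrameReader and readFrame are hypothetical names.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: reads one length-prefixed message from a stream.
// Assumed wire format: 4-byte big-endian length, then the payload bytes.
class FrameReader {
    static byte[] readFrame(InputStream in) throws IOException {
        DataInputStream din = new DataInputStream(in);
        int length = din.readInt(); // blocks until the 4 length bytes have arrived
        if (length < 0) {
            throw new IOException("negative frame length: " + length);
        }
        byte[] payload = new byte[length];
        // readFully blocks until exactly 'length' bytes have been read,
        // and throws EOFException if the stream ends first.
        din.readFully(payload);
        return payload;
    }
}
```

    Because the client knows how many bytes to expect, it never needs available() and never reads past the end of the message. The payload can then be turned into text with new String(payload, "UTF-8"), or kept as raw bytes if it is not character data. Any server that can write a 4-byte big-endian integer could produce the prefix.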

    If the server is also Java, then the easiest mechanism might be to use a DataOutputStream on the server and a DataInputStream on the client, as this will make everything easier. writeUTF()/readUTF() will automatically exchange a length and a sequence of bytes, all encoded and decoded for you.
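    A rough round-trip sketch of writeUTF()/readUTF(), using in-memory streams in place of the real socket (UtfRoundTrip, encode and decode are made-up names):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo class: what writeUTF() puts on the wire and how
// readUTF() gets it back.
class UtfRoundTrip {
    // What a Java server would do before sending.
    static byte[] encode(String text) throws IOException {
        ByteArrayOutputStream bout = new ByteArrayOutputStream();
        DataOutputStream dout = new DataOutputStream(bout);
        dout.writeUTF(text); // 2-byte length, then the encoded characters
        dout.flush();
        return bout.toByteArray();
    }

    // What the client would do on its InputStream.
    static String decode(byte[] wire) throws IOException {
        DataInputStream din = new DataInputStream(new ByteArrayInputStream(wire));
        return din.readUTF(); // reads the length, then exactly that many bytes
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = encode("a\u00e9");
        System.out.println(wire.length + " bytes on the wire");
        System.out.println(decode(wire).length() + " chars decoded");
    }
}
```

    Two caveats: writeUTF() uses Java's modified UTF-8 rather than standard UTF-8, and its 2-byte length prefix limits each string to 65535 encoded bytes.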

    Graham.

  15. #15
    Regular Contributor
    Join Date
    May 2009
    Posts
    95

    Re: Specifying InputStream encoding...

    Actually the server side is php... so i assume that writeUTF() can't be used... :-/

    So there's no other way to check the length of data in the inputStream, maybe i can override the available() method i don't know, because i really need it and it's not nice to send the length before transmitting the text...

    And btw, write/read UTF isn't possible for data which is not characters right ? Because the system will be modified later on to handle data which is not characters too. And i don't want to find a solution with something like read/writeUTF and then i'll have to change everything later on.

    The only idea that i have in mind is reading byte by byte :-/ because that way if i try to read the next byte it will return -1 and i can set that condition in the while loop.

    But still i can't check if data is available without reading !!
