CLinker.toCString replacement in Java 18
Java 16, as part of the incubating package jdk.incubator.foreign, used to provide a convenient way to convert Java Strings to C strings of an arbitrary Charset using MemorySegment CLinker.toCString(String str, Charset charset, NativeScope scope). That method was removed in Java 17. Is there currently a convenient method to convert a Java String to a C string of a selected Charset?
Java 18 has void MemorySegment.setUtf8String(long offset, String str). However, that obviously only supports UTF-8.
Solution 1:[1]
I use this snippet to convert strings to UTF-16:
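import java.nio.charset.StandardCharsets;
import jdk.incubator.foreign.*;

private static MemoryAddress string(String s, ResourceScope scope) {
    if (s == null) {
        return MemoryAddress.NULL;
    }
    byte[] data = s.getBytes(StandardCharsets.UTF_16LE);
    // Native memory is zero-initialized, so the 2 extra bytes act as the UTF-16 null terminator.
    MemorySegment seg = MemorySegment.allocateNative(data.length + 2, scope);
    seg.copyFrom(MemorySegment.ofArray(data));
    return seg.address();
}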
Note that the trailing null character takes 2 bytes in UTF-16. If you use a different encoding, you may need to append the terminator to the string before encoding it (s + '\000').
UTF-16 is good enough for my purposes - calling the Windows API.
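The terminator width mentioned above can be checked without any native code. The following is a minimal sketch using only java.nio.charset; the class name NullTerminatorDemo and its two helpers are made up for illustration and are not part of the answer. It also assumes the runtime provides the UTF-32LE charset, which typical OpenJDK builds do but the specification does not guarantee.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class NullTerminatorDemo {
    /** Encodes s followed by a single NUL character in the given charset. */
    public static byte[] encodeWithTerminator(String s, Charset charset) {
        return (s + "\0").getBytes(charset);
    }

    /** Number of bytes that one NUL character occupies in the given charset. */
    public static int terminatorLength(Charset charset) {
        return "\0".getBytes(charset).length;
    }

    public static void main(String[] args) {
        Charset[] charsets = {
            StandardCharsets.UTF_8,       // NUL is 1 byte
            StandardCharsets.UTF_16LE,    // NUL is 2 bytes
            Charset.forName("UTF-32LE")   // NUL is 4 bytes (assumed to be available)
        };
        for (Charset cs : charsets) {
            System.out.println(cs + ": terminator bytes = " + terminatorLength(cs)
                    + ", \"hi\" with terminator = " + encodeWithTerminator("hi", cs).length);
        }
    }
}
```

One thing to watch when picking the charset: StandardCharsets.UTF_16 (without LE or BE) writes a byte-order mark when encoding, so an explicit-endian charset such as UTF_16LE is usually what a native API expects.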
Solution 2:[2]
On JDK 18 I convert (s+"\0"), which typically adds 1, 2 or 4 bytes of null termination to the end of the MemorySegment for the C string, depending on the character set used:
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import jdk.incubator.foreign.*;

static MemorySegment toCString(SegmentAllocator allocator, String s, Charset charset) {
    // "==" is OK here as StandardCharsets.UTF_8 == Charset.forName("UTF8")
    if (StandardCharsets.UTF_8 == charset)
        return allocator.allocateUtf8String(s);
    return allocator.allocateArray(ValueLayout.JAVA_BYTE, (s+"\0").getBytes(charset));
}
On Windows, converting a Java String to a wide string is then: toCString(allocator, s, StandardCharsets.UTF_16LE)
Hopefully someone can offer a more efficient / robust way to convert. The above works for round-trip tests I've done on a small group of character sets (Windows + WSL), but I'm not confident it is reliable in all situations.
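One place where the getBytes approach can be fragile: String.getBytes(Charset) silently substitutes a replacement for characters the charset cannot represent, and a '\0' already present in the input would cut the C string short on the native side. The sketch below shows one possible stricter encoding step, assuming you would rather fail than pass altered text to native code. The class StrictCStringBytes and its encode method are hypothetical names, not part of the answer or of the JDK.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictCStringBytes {
    /**
     * Encodes s plus a NUL terminator, throwing instead of substituting
     * when a character cannot be represented in the charset.
     */
    public static byte[] encode(String s, Charset charset) throws CharacterCodingException {
        if (s.indexOf('\0') >= 0) {
            throw new IllegalArgumentException("string contains an embedded NUL");
        }
        ByteBuffer buf = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .encode(CharBuffer.wrap(s + "\0"));
        byte[] bytes = new byte[buf.remaining()];
        buf.get(bytes);
        return bytes;
    }

    public static void main(String[] args) throws CharacterCodingException {
        // 2 characters + terminator, 2 bytes each in UTF-16LE
        System.out.println(encode("hi", StandardCharsets.UTF_16LE).length); // prints 6
        try {
            encode("\u00e9", StandardCharsets.US_ASCII); // 'é' is not representable in ASCII
        } catch (CharacterCodingException e) {
            System.out.println("rejected: " + e);
        }
    }
}
```

The resulting byte array could then be handed to allocator.allocateArray(ValueLayout.JAVA_BYTE, bytes) in place of the getBytes call in toCString.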
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Johannes Kuhn |
Solution 2 | DuncG |