diff --git a/src/proto/environment/bytes.mli b/src/proto/environment/bytes.mli
index 746bf1c3c..4d9276ce7 100644
--- a/src/proto/environment/bytes.mli
+++ b/src/proto/environment/bytes.mli
@@ -301,125 +301,6 @@ val equal: t -> t -> bool
     always-correct {!Bytes.to_string} and {!Bytes.of_string} instead.
 *)
 
-val unsafe_to_string : bytes -> string
-(** Unsafely convert a byte sequence into a string.
-
-    To reason about the use of [unsafe_to_string], it is convenient to
-    consider an "ownership" discipline. A piece of code that
-    manipulates some data "owns" it; there are several disjoint ownership
-    modes, including:
-    - Unique ownership: the data may be accessed and mutated
-    - Shared ownership: the data has several owners, that may only
-      access it, not mutate it.
-
-    Unique ownership is linear: passing the data to another piece of
-    code means giving up ownership (we cannot write the
-    data again). A unique owner may decide to make the data shared
-    (giving up mutation rights on it), but shared data may not become
-    uniquely-owned again.
-
-   [unsafe_to_string s] can only be used when the caller owns the byte
-   sequence [s] -- either uniquely or as shared immutable data. The
-   caller gives up ownership of [s], and gains ownership of the
-   returned string.
-
-   There are two valid use-cases that respect this ownership
-   discipline:
-
-   1. Creating a string by initializing and mutating a byte sequence
-   that is never changed after initialization is performed.
-
-   {[
-let string_init len f : string =
-  let s = Bytes.create len in
-  for i = 0 to len - 1 do Bytes.set s i (f i) done;
-  Bytes.unsafe_to_string s
-   ]}
-
-   This function is safe because the byte sequence [s] will never be
-   accessed or mutated after [unsafe_to_string] is called. The
-   [string_init] code gives up ownership of [s], and returns the
-   ownership of the resulting string to its caller.
-
-   Note that it would be unsafe if [s] was passed as an additional
-   parameter to the function [f] as it could escape this way and be
-   mutated in the future -- [string_init] would give up ownership of
-   [s] to pass it to [f], and could not call [unsafe_to_string]
-   safely.
-
-   We have provided the {!String.init}, {!String.map} and
-   {!String.mapi} functions to cover most cases of building
-   new strings. You should prefer those over [to_string] or
-   [unsafe_to_string] whenever applicable.
-
-   2. Temporarily giving ownership of a byte sequence to a function
-   that expects a uniquely owned string and returns ownership back, so
-   that we can mutate the sequence again after the call ended.
-
-   {[
-let bytes_length (s : bytes) =
-  String.length (Bytes.unsafe_to_string s)
-   ]}
-
-   In this use-case, we do not promise that [s] will never be mutated
-   after the call to [bytes_length s]. The {!String.length} function
-   temporarily borrows unique ownership of the byte sequence
-   (and sees it as a [string]), but returns this ownership back to
-   the caller, which may assume that [s] is still a valid byte
-   sequence after the call. Note that this is only correct because we
-   know that {!String.length} does not capture its argument -- it could
-   escape by a side-channel such as a memoization combinator.
-
-   The caller may not mutate [s] while the string is borrowed (it has
-   temporarily given up ownership). This affects concurrent programs,
-   but also higher-order functions: if [String.length] returned
-   a closure to be called later, [s] should not be mutated until this
-   closure is fully applied and returns ownership.
-*)
-
-val unsafe_of_string : string -> bytes
-(** Unsafely convert a shared string to a byte sequence that should
-    not be mutated.
-
-    The same ownership discipline that makes [unsafe_to_string]
-    correct applies to [unsafe_of_string]: you may use it if you were
-    the owner of the [string] value, and you will own the return
-    [bytes] in the same mode.
-
-    In practice, unique ownership of string values is extremely
-    difficult to reason about correctly. You should always assume
-    strings are shared, never uniquely owned.
-
-    For example, string literals are implicitly shared by the
-    compiler, so you never uniquely own them.
-
-    {[
-let incorrect = Bytes.unsafe_of_string "hello"
-let s = Bytes.of_string "hello"
-    ]}
-
-    The first declaration is incorrect, because the string literal
-    ["hello"] could be shared by the compiler with other parts of the
-    program, and mutating [incorrect] is a bug. You must always use
-    the second version, which performs a copy and is thus correct.
-
-    Assuming unique ownership of strings that are not string
-    literals, but are (partly) built from string literals, is also
-    incorrect. For example, mutating [unsafe_of_string ("foo" ^ s)]
-    could mutate the shared string ["foo"] -- assuming a rope-like
-    representation of strings. More generally, functions operating on
-    strings will assume shared ownership, they do not preserve unique
-    ownership. It is thus incorrect to assume unique ownership of the
-    result of [unsafe_of_string].
-
-    The only case we have reasonable confidence is safe is if the
-    produced [bytes] is shared -- used as an immutable byte
-    sequence. This is possibly useful for incremental migration of
-    low-level programs that manipulate immutable sequences of bytes
-    (for example {!Marshal.from_bytes}) and previously used the
-    [string] type for this purpose.
-*)
-
 (** Functions reading and writing bytes *)
 
 val get_char: t -> int -> char
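
Note on the removed functions: the hunk drops [unsafe_to_string] and [unsafe_of_string] and keeps the copying conversions mentioned in the retained context ({!Bytes.to_string} and {!Bytes.of_string}). The sketch below shows how the two use-cases from the removed documentation might be written with copying conversions only. It is written against the OCaml standard library [Bytes] module, on the assumption that the environment's interface behaves the same way for these calls. The names [string_init_safe] and [bytes_length_safe] are hypothetical and do not appear in the patch.

```ocaml
(* A minimal sketch, assuming the standard library [Bytes] module.
   [string_init_safe] and [bytes_length_safe] are hypothetical names
   introduced only for this illustration. *)

(* Use-case 1: build a string by filling a fresh buffer, then copy it
   out. [Bytes.to_string] allocates a new string, so later mutation of
   the buffer cannot affect the result. *)
let string_init_safe (len : int) (f : int -> char) : string =
  let b = Bytes.create len in
  for i = 0 to len - 1 do
    Bytes.set b i (f i)
  done;
  Bytes.to_string b

(* Use-case 2: the length can be read from the byte sequence directly,
   so no conversion (safe or unsafe) is needed. *)
let bytes_length_safe (b : bytes) : int = Bytes.length b

let () =
  let s = string_init_safe 3 (fun i -> Char.chr (Char.code 'a' + i)) in
  (* [Bytes.of_string] copies, so mutating [b] leaves [s] untouched. *)
  let b = Bytes.of_string s in
  Bytes.set b 0 'z';
  assert (s = "abc");
  assert (Bytes.to_string b = "zbc");
  assert (bytes_length_safe b = 3)
```

The cost of this approach is one extra allocation and copy per conversion. Where that matters, {!String.init}, {!String.map} and {!String.mapi}, which the removed documentation already recommends, build the string without an intermediate conversion.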